To My Parents


Preface to the First Edition 


This book is composed of two parts: Part I (Chaps. 1-3) is an introduction to 
tensors and their physical applications, and Part II (Chaps. 4-6) introduces group
theory and intertwines it with the earlier material. Both parts are written at the 
advanced undergraduate/beginning graduate level, although in the course of Part II 
the sophistication level rises somewhat. Though the two parts differ somewhat in 
flavor, I have aimed in both to fill a (perceived) gap in the literature by connecting 
the component formalisms prevalent in physics calculations to the abstract but more 
conceptual formulations found in the math literature. My firm belief is that we need 
to see tensors and groups in coordinates to get a sense of how they work, but also 
need an abstract formulation to understand their essential nature and organize our 
thinking about them. 

My original motivation for the book was to demystify tensors and provide a 
unified framework for understanding them in all the different contexts in which they 
arise in physics. The word tensor is ubiquitous in physics (stress tensor, moment 
of inertia tensor, field tensor, metric tensor, tensor product, etc.) and yet tensors are 
rarely defined carefully, and the definition usually has to do with transformation 
properties, making it difficult to get a feel for what these objects are. Furthermore, 
physics texts at the beginning graduate level usually only deal with tensors in 
their component form, so students wonder what the difference is between a second 
rank tensor and a matrix, and why new, enigmatic terminology is introduced for 
something they’ve already seen. All of this produces a lingering unease, which I 
believe can be alleviated by formulating tensors in a more abstract but conceptually 
much clearer way. This coordinate-free formulation is standard in the mathematical 
literature on differential geometry and in physics texts on General Relativity, but, as 
far as I can tell, is not accessible to undergraduates or beginning graduate students 
in physics who just want to learn what a tensor is without dealing with the full 
machinery of tensor analysis on manifolds. 

The irony of this situation is that a proper understanding of tensors doesn’t
require much more mathematics than what you likely encountered as an undergraduate.
In Chap. 2, I introduce this additional mathematics, which is just an
extension of the linear algebra you probably saw in your lower division coursework.
This material sets the stage for tensors and hopefully also illuminates some of the
more enigmatic objects from quantum mechanics and relativity, such as bras and
kets, covariant and contravariant components of vectors, and spherical harmonics.
After laying the necessary linear algebraic foundations, we give in Chap. 3 the
modern (component-free) definition of tensors, all the while keeping contact with
the coordinate and matrix representations of tensors and their transformation laws.
Applications in classical and quantum physics follow.

In Part II of the book, I introduce group theory and its physical applications, 
which is a beautiful subject in its own right and also a nice application of the material 
in Part I. There are many good books on the market for group theory and physics (see 
the references), so rather than be exhaustive, I have just attempted to present those 
aspects of the subject most essential for upper division and graduate level physics 
courses. In Chap. 4, I introduce abstract groups but quickly illustrate that concept 
with myriad examples from physics. After all, there would be little point in making 
such an abstract definition if it didn’t subsume many cases of interest! We then 
introduce Lie groups and their associated Lie algebras, making precise the nature of 
the symmetry “generators” that are so central in quantum mechanics. Much time is 
also spent on the groups of rotations and Lorentz transformations, since these are so 
ubiquitous in physics. 

In Chap. 5, I introduce representation theory, which is a mathematical formaliza- 
tion of what we mean by the “transformation properties” of an object. This subject 
sews together the material from Chaps. 2 and 3 and is one of the most important 
applications of tensors, at least for physicists. Chapter 6 then applies and extends 
the results of Chap. 5 to a few specific topics: the perennially mysterious “spherical” 
tensors, the Wigner-Eckart theorem, and Dirac bilinears. These topics are unified
by the introduction of the representation operator, which is admittedly somewhat 
abstract but neatly organizes these objects into a single mathematical framework. 

This text aims (perhaps naively!) to be simultaneously intuitive and rigorous. 
Thus, although much of the language (especially in the examples) is informal, 
almost all the definitions given are precise and are the same as one would find in a 
pure math text. This may put you off if you feel less mathematically inclined; I hope, 
however, that you will work through your discomfort and develop the necessary 
mathematical sophistication, as the results will be well worth it. Furthermore, if you 
can work your way through the text (or at least most of Chap. 5), you will be well 
prepared to tackle graduate math texts in related areas. 

As for prerequisites, it is assumed that you have been through the usual under- 
graduate physics curriculum, including a “mathematical methods for physicists” 
course (with at least a cursory treatment of vectors and matrices) as well as the 
standard upper division courses in classical mechanics, quantum mechanics, and 
relativity. Any undergraduate versed in those topics, as well as any graduate student 
in physics, should be able to read this text. To undergraduates who are eager to learn 
about tensors but haven’t yet completed the standard curriculum, I apologize; many 
of the examples and practically all of the motivation for the text come from those 
courses, and to assume no knowledge of those topics would preclude discussion 
of the many “examples” that motivated me to write this book in the first place.
However, if you are motivated and willing to consult the references, you could
certainly work through this text and would no doubt be in excellent shape for those 
upper division courses once you take them. 

Exercises and problems are included in the text, with exercises occurring within 
the chapters and problems occurring at the end of each chapter. The exercises in 
particular should be done as they arise, or at least carefully considered, as they often 
flesh out the text and provide essential practice in using the definitions. Very few of 
the exercises are computationally intensive, and many of them can be done in a few 
lines. They are designed primarily to test your conceptual understanding and help 
you internalize the subject. Please don’t ignore them! 

Besides the aforementioned prerequisites, I’ve also indulged in the use of some 
very basic mathematical shorthand for brevity’s sake; a guide is below. Also, be 
aware that for simplicity’s sake, I’ve set all physical constants such as c and ħ equal
to 1. Enjoy!

While teaching courses based on the first edition, I found myself augmenting the
material in the text in various ways and thought that these additions might make the 
text a bit more user friendly. This was the motivation to undertake a second edition, 
which in addition to these significant additions also includes the usual correction of 
typos and other minor improvements. The major new elements are as follows: 


More motivation. One well-received feature of the first edition was the intro- 
ductory chapter on tensors, which quickly and intuitively conveys some of the 
main points and sets the stage for the more detailed treatment that follows. Such 
motivation was conspicuously absent in Part II of the book, so I have added 
introductory sections to these later chapters, with a similar aim: to convey the 
take-home messages as directly as possible, leaving the details and secondary 
examples for the remainder of the chapter. I have also added an epilogue to 
Chap. 2 which takes stock of the various mathematical structures built up in that 
chapter, in the hope of coherently organizing them. 


More figures and tables. The theory of Lie groups is at heart a geometric one, 
but making the geometric picture precise requires the machinery of differential 
geometry, which I very specifically wished to avoid in this book.¹ That, however,
is no reason to omit the actual pictures one should have in mind when thinking 
about Lie groups, and so those figures, along with several others, are now 
included in the text. I’ve also included more tables in the representation theory
chapter, for help in organizing the menagerie of representations that arise in 
physics. 


More varied formatting. In teaching from the first edition, I also perceived an 
opportunity to make the visual format of the text more expressive of the content. 
To this end, I have taken some of the punch lines of various sections, which
previously had been bold but in-line in the text, and have separated them from
the main text for emphasis. I have also introduced some boxed text for important
side discussions to emphasize that they are a departure from the main storyline
but still significant. Less important departures are relegated to footnotes (which
are frequent).


¹This is consonant with the approach taken by Hall [11]. There are, of course, many excellent
books which do take the geometric approach, which for the committed theoretical physicist is
undoubtedly the right one; see, e.g., Frankel [6] and Schutz [18].


In addition to the above, there have been many smaller improvements and 
corrections, many of which were pointed out or suggested by students and other 
readers. Thanks are due to all the Phys 198 students at UC Berkeley for these and 
for enduring my various pedagogical experiments, both with this book and in the 
classroom. Thanks are also due to Roger Berlind and Prof. John Colarusso, who 
read the text extremely closely and provided detailed feedback through extended 
correspondence. Their interest and attention have been gratifying, as well as 
extremely helpful in improving the book. 

Valuable and detailed feedback on the first edition of this book was given
by Hal Haggard, Mark Moriarty, Albert Shieh, Felicitas Hernandez, and Emily 
Rauscher. Early mentorship and support were given by professors Robert Penner 
and Ko Honda of the U.S.C. mathematics department, whose encouragement was 
instrumental in my pursuing mathematics and physics to the degree that I have. 
Thanks are also due to my colleagues past and present at Birkhäuser, for taking a
chance on this young author with the first edition and supporting his ambitions for 
a second. 

Finally, I must again thank my family, which has now grown to include my 
partner Erika, and our children Seamus and Samina; they are the reason. 


Berkeley, CA, USA Nadir Jeevanjee 


Notation 


Some Mathematical Shorthand 


$\mathbb{N}$              The set of natural numbers (positive integers)
$\mathbb{Z}$              The set of positive and negative integers
$\mathbb{R}$              The set of real numbers
$\mathbb{C}$              The set of complex numbers
$\in$                     “is an element of”, “an element of”, i.e. $2 \in \mathbb{R}$ reads “2 is an
                          element of the real numbers”
$\notin$                  “is not an element of”
$\forall$                 “for all”
$\subset$                 “is a subset of”, “a subset of”
$\equiv$                  Denotes a definition
$f : A \to B$             Denotes a map $f$ that takes elements of the set $A$ into
                          elements of the set $B$
$f : a \mapsto b$         Indicates that the map $f$ sends the element $a$ to the
                          element $b$
$\circ$                   Denotes a composition of maps, i.e. if $f : A \to B$ and
                          $g : B \to C$, then $g \circ f : A \to C$ is given by
                          $(g \circ f)(a) = g(f(a))$
$A \times B$              The set $\{(a, b)\}$ of all ordered pairs where $a \in A$, $b \in B$.
                          Referred to as the cartesian product of sets $A$ and $B$.
                          Extends in the obvious way to n-fold products $A_1 \times \dots \times A_n$.
$\mathbb{R}^n$            $\mathbb{R} \times \dots \times \mathbb{R}$ (n times)
$\mathbb{C}^n$            $\mathbb{C} \times \dots \times \mathbb{C}$ (n times)
$\{A \mid Q\}$            Denotes a set $A$ subject to condition $Q$. For instance, the set
                          of all even integers can be written as $\{x \in \mathbb{R} \mid x/2 \in \mathbb{Z}\}$
$\square$                 Denotes the end of a proof or example
Dirac Dictionary


We summarize here all of the translations given in the text between quantum
mechanical Dirac notation and standard mathematical notation.


                  Standard Notation                          Dirac Notation

Vector            $\psi \in \mathcal{H}$                     $|\psi\rangle$
Dual vector       $L(\psi)$                                  $\langle\psi|$
Inner product     $(\psi|\phi)$                              $\langle\psi|\phi\rangle$
                  $A(\psi)$, $A \in \mathcal{L}(\mathcal{H})$   $A|\psi\rangle$
                  $(\psi, A\phi)$                            $\langle\psi|A|\phi\rangle$
                  $T_i{}^j\, e^i \otimes e_j$                $\sum_{i,j} T_{ij}\, |j\rangle\langle i|$
                  $e_i \otimes e_j$                          $|i\rangle|j\rangle$ or $|i, j\rangle$
Part I
Linear Algebra and Tensors 


Chapter 1 
A Quick Introduction to Tensors 


The reason tensors are introduced in a somewhat ad-hoc manner in most physics 
courses is twofold: first, a detailed and proper understanding of tensors requires 
mathematics that is slightly more abstract than the standard linear algebra and vector 
calculus that physics students use every day. Second, students don’t necessarily need
such an understanding to be able to manipulate tensors and solve problems with 
them. The drawback, of course, is that many students feel uneasy with tensors; they 
can use them for computation but don’t have an intuitive feel for what they’re doing. 
One of the primary aims of this book is to alleviate that unease. Doing that, however, 
requires a modest investment (about 30 pages) in some abstract linear algebra, so 
before diving into the details we’ll begin with a rough overview of what a tensor 
is, which hopefully will whet your appetite and tide you over until we can discuss 
tensors in full detail in Chap. 3. 

Many older books define a tensor as a collection of objects which carry indices 
and which “transform” in a particular way specified by those indices. Unfortunately, 
this definition usually doesn’t yield much insight into what a tensor is. One of the 
main purposes of the present text is to promulgate the more modern definition of 
a tensor, which is equivalent to the old one but is more conceptual and is in fact 
already standard in the mathematics literature. This definition takes a tensor to be a 
function which eats a certain number of vectors (known as the rank r of the tensor) 
and produces a number. The distinguishing characteristic of a tensor is a special 
property called multilinearity, which means that it must be linear in each of its r
arguments (recall that linearity for a function with a single argument just means
that T(v + cw) = T(v) + cT(w) for all vectors v and w and numbers c). As we
will explain in a moment, this multilinearity enables us to express the value of the
function on an arbitrary set of r vectors in terms of the values of the function on
r basis vectors like $\hat{x}$, $\hat{y}$, and $\hat{z}$. These values of the function on basis vectors are
nothing but the familiar components of the tensor, which in older treatments are 
usually introduced first as part of the definition of the tensor. 

We’ll make this concrete by considering a couple of extended examples. 
Example 1.1. The Levi-Civita symbol and the volume tensor $\epsilon$ on $\mathbb{R}^3$


You may have encountered the Levi-Civita symbol in coursework in classical or
quantum mechanics. We will denote it here¹ as $\bar{\epsilon}_{ijk}$, where the indices $i$, $j$, and $k$
range from 1 to 3. The symbol takes on different numerical values depending on the
values of the indices, as follows:


$$
\bar{\epsilon}_{ijk} =
\begin{cases}
0 & \text{unless } i \neq j \neq k \\
+1 & \text{if } \{i, j, k\} = \{1,2,3\},\ \{2,3,1\},\ \text{or } \{3,1,2\} \\
-1 & \text{if } \{i, j, k\} = \{3,2,1\},\ \{1,3,2\},\ \text{or } \{2,1,3\}.
\end{cases}
\tag{1.1}
$$


Sometimes one sees this defined in words, as follows: $\bar{\epsilon}_{ijk} = 1$ if $\{i, j, k\}$ is a “cyclic
permutation” of $\{1, 2, 3\}$, $-1$ if $\{i, j, k\}$ is an “anti-cyclic permutation” of $\{1, 2, 3\}$,
and 0 otherwise.

The Levi-Civita symbol is usually introduced to physicists as a convenient 
shorthand that simplifies expressions and calculations; for instance, it allows one 
to write a simple expression for the components of the cross product of two vectors 
v and w: 


$$
(v \times w)^i = \sum_{j,k=1}^{3} \bar{\epsilon}_{ijk}\, v^j w^k .
$$
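As a quick numerical illustration of (1.1) and the cross product formula above, here is a minimal sketch in plain Python. The function names (`levi_civita`, `cross_from_symbol`, `cross_direct`) and the closed-form product used to evaluate the symbol are choices made for this sketch, not notation from the text.

```python
from itertools import product

def levi_civita(i, j, k):
    """Levi-Civita symbol for indices i, j, k in {1, 2, 3}."""
    # The product (i-j)(j-k)(k-i) equals +2 for cyclic orderings of (1,2,3),
    # -2 for anti-cyclic orderings, and 0 whenever two indices coincide.
    return (i - j) * (j - k) * (k - i) // 2

def cross_from_symbol(v, w):
    """Components (v x w)^i = sum over j, k of eps_ijk v^j w^k (1-based indices)."""
    idx = (1, 2, 3)
    return [
        sum(levi_civita(i, j, k) * v[j - 1] * w[k - 1]
            for j, k in product(idx, repeat=2))
        for i in idx
    ]

def cross_direct(v, w):
    """Familiar component formula for the cross product, for comparison."""
    return [v[1] * w[2] - v[2] * w[1],
            v[2] * w[0] - v[0] * w[2],
            v[0] * w[1] - v[1] * w[0]]

v, w = [1, 2, 3], [4, 5, 6]
print(cross_from_symbol(v, w))  # [-3, 6, -3]
print(cross_direct(v, w))       # [-3, 6, -3]
```

Both routes give the same vector, which is all the symbol is doing in that formula: bookkeeping for the signs.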


It also allows for a compact expression of the quantum-mechanical angular momentum
commutation relations:


$$
[L_i, L_j] = \sum_{k=1}^{3} i\,\bar{\epsilon}_{ijk}\, L_k .
$$


Despite its utility, however, the Levi-Civita symbol is rarely given any mathematical
or physical interpretation, and (like tensors more generally) ends up being something
that students know how to use but don’t have a feel for. In this example we’ll
show how our new point of view on tensors sheds considerable light on both the
mathematical nature and the geometric interpretation of the Levi-Civita symbol.

To begin, let’s define a rank-three tensor, denoted $\epsilon$, where $\epsilon$ eats three vectors
u, v, and w and produces a number $\epsilon(u, v, w)$. We’d like to interpret $\epsilon(u, v, w)$ as the
(oriented) volume of the parallelepiped spanned by u, v, and w; see Fig. 1.1. From
vector calculus we know that we can accomplish this by defining


$$
\epsilon(u, v, w) = (u \times v) \cdot w. \tag{1.2}
$$


¹Usually the bar across the top is omitted, but we will need it to conceptually distinguish the
Levi-Civita symbol from the epsilon tensor defined below.

Fig. 1.1 The parallelepiped spanned by u, v, and w


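For $\epsilon$ to really be a tensor, however, it must be multilinear, i.e. linear in each
argument. This means


$$
\begin{aligned}
\epsilon(u_1 + c u_2, v, w) &= \epsilon(u_1, v, w) + c\,\epsilon(u_2, v, w) && \text{(1.3a)} \\
\epsilon(u, v_1 + c v_2, w) &= \epsilon(u, v_1, w) + c\,\epsilon(u, v_2, w) && \text{(1.3b)} \\
\epsilon(u, v, w_1 + c w_2) &= \epsilon(u, v, w_1) + c\,\epsilon(u, v, w_2) && \text{(1.3c)}
\end{aligned}
$$


for all numbers $c$ and vectors $u$, $v$, $w$, $v_1$, etc. Let’s check that (1.3a) holds:


$$
\begin{aligned}
\epsilon(u_1 + c u_2, v, w) &= ((u_1 + c u_2) \times v) \cdot w \\
&= (u_1 \times v + c\, u_2 \times v) \cdot w \\
&= (u_1 \times v) \cdot w + c\,(u_2 \times v) \cdot w \\
&= \epsilon(u_1, v, w) + c\,\epsilon(u_2, v, w).
\end{aligned}
$$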


Thus $\epsilon$ really is linear in the first argument. The check for (1.3b) and (1.3c) proceeds
similarly and is left as an exercise. Thus, $\epsilon$ satisfies our definition of a tensor as a
multilinear function. But how does this square with our usual notion of a tensor as
a set of numbers with some specified transformation properties? We claimed above
that these numbers, known as the components of a tensor, are nothing but the tensor
evaluated on sets of basis vectors. So, let’s evaluate $\epsilon$ on three arbitrary basis vectors.
Ordinarily, a basis vector is one of $\hat{x}$, $\hat{y}$, or $\hat{z}$, but for the purposes of this
example it will be easier to call these $e_1$, $e_2$, and $e_3$, respectively. We’ll arbitrarily
choose three of them (since $\epsilon$ is rank three) and call these choices $e_i$, $e_j$, and $e_k$,
where it’s possible that $i$, $j$, and $k$ are not all distinct. Then we leave it as an exercise
for you to check, using (1.2), that


$$
\epsilon(e_i, e_j, e_k) =
\begin{cases}
0 & \text{unless } i \neq j \neq k \\
+1 & \text{if } \{i, j, k\} = \{1,2,3\},\ \{2,3,1\},\ \text{or } \{3,1,2\} \\
-1 & \text{if } \{i, j, k\} = \{3,2,1\},\ \{1,3,2\},\ \text{or } \{2,1,3\}.
\end{cases}
\tag{1.4}
$$

If, as mentioned above, we then define the components of the $\epsilon$ tensor to be the
numbers


$$
\epsilon_{ijk} = \epsilon(e_i, e_j, e_k), \tag{1.5}
$$


then (1.4) tells us that the components of the $\epsilon$ tensor are nothing but the Levi-Civita
symbol! This is a major shift in perspective, and tells us several things. First:


1. The Levi-Civita symbol is not merely a mathematical convenience or shorthand;
it actually represents the components of a tensor, the volume tensor (or
Levi-Civita tensor).


Furthermore, Eq. (1.5) tells us that:


2. The components of a tensor are just the values of the tensor evaluated on
a corresponding set of basis vectors.


Combining 1 and 2 above then gives the following geometric interpretation of the
Levi-Civita symbol:


3. $\bar{\epsilon}_{ijk}$ is the volume of the oriented parallelepiped spanned by $e_i$, $e_j$, and $e_k$.


Another property worth noting is that the definition (1.2) of $\epsilon$ does not require us to
choose a basis for our vector space. This makes sense, because $\epsilon$ computes volumes
of parallelepipeds, which are geometrical quantities which exist independently of
any basis. We can thus add a fourth observation to our list:


4. The $\epsilon$ tensor exists independently of any basis. This is in contrast to its
components, which by (1.5) are manifestly basis-dependent.
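The four observations can be spot-checked numerically. The sketch below, in plain Python, implements the volume tensor directly from (1.2), tests linearity in the first slot on one arbitrary choice of vectors, tabulates the components (1.5) in the standard basis, and then evaluates the same tensor on a rescaled basis. The helper names, the sample vectors, and the rescaled basis are illustrative assumptions, and a check on sample inputs is of course not a proof.

```python
from itertools import product

def cross(u, v):
    """Cross product of two 3-vectors."""
    return (u[1] * v[2] - u[2] * v[1],
            u[2] * v[0] - u[0] * v[2],
            u[0] * v[1] - u[1] * v[0])

def dot(u, v):
    """Dot product of two 3-vectors."""
    return sum(a * b for a, b in zip(u, v))

def epsilon(u, v, w):
    """Volume tensor of (1.2): the oriented volume (u x v) . w."""
    return dot(cross(u, v), w)

def combo(a, c, b):
    """Return the vector a + c*b."""
    return tuple(x + c * y for x, y in zip(a, b))

# Linearity in the first argument, as in (1.3a), on sample vectors.
u1, u2, v, w, c = (1, 2, 3), (0, -1, 4), (2, 0, 1), (1, 1, 1), 5
lhs = epsilon(combo(u1, c, u2), v, w)
rhs = epsilon(u1, v, w) + c * epsilon(u2, v, w)
print(lhs == rhs)

# Components (1.5) in the standard basis e_1, e_2, e_3.
e = {1: (1, 0, 0), 2: (0, 1, 0), 3: (0, 0, 1)}
components = {(i, j, k): epsilon(e[i], e[j], e[k])
              for i, j, k in product((1, 2, 3), repeat=3)}
print(components[(1, 2, 3)], components[(2, 1, 3)], components[(1, 1, 2)])

# Same tensor, different (hypothetical) basis f_i = 2 e_i: the component changes.
f = {i: tuple(2 * x for x in e[i]) for i in e}
print(epsilon(f[1], f[2], f[3]))
```

The last line is observation 4 in miniature: the function `epsilon` never changed, but its value on the first three basis vectors went from 1 to the volume of a cube of side 2.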


While all this may illuminate the nature of the Levi-Civita symbol, and tensors
more generally, we still don’t know that the $\epsilon_{ijk}$ as defined here “transform” in the
manner specified by the usual definition of a tensor. We’ll see how this works in our
next example, that of a generic rank-two tensor.


Exercise 1.1. Complete the proof of multilinearity by verifying (1.3b) and (1.3c), using 
the definition (1.2). Also use (1.2) to verify (1.4). 


Example 1.2. A generic rank-two tensor 


In this example we’ll analyze a generic rank-two tensor, using our modern definition
of a tensor as a multilinear function. This new viewpoint will clear up some of the
pervasive and perennial confusion related to tensors, as we’ll see.

Consider a rank-two tensor T, whose job it is to eat two vectors v and w and
produce a number T(v, w). In analogy to (1.3), multilinearity for this tensor means


$$
\begin{aligned}
T(v_1 + c v_2, w) &= T(v_1, w) + c\,T(v_2, w) \\
T(v, w_1 + c w_2) &= T(v, w_1) + c\,T(v, w_2)
\end{aligned}
\tag{1.6}
$$


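for any number c and all vectors v and w. An important consequence of multilinearity
is that if we have a coordinate basis for our vector space, say $\hat{x}$, $\hat{y}$, and $\hat{z}$, then
T is determined entirely by its components, i.e. its values on the basis vectors, as
follows: first, expand v and w in the coordinate basis as


$$
v = v_x \hat{x} + v_y \hat{y} + v_z \hat{z}, \qquad
w = w_x \hat{x} + w_y \hat{y} + w_z \hat{z}.
$$


Then we have


$$
\begin{aligned}
T(v, w) &= T(v_x \hat{x} + v_y \hat{y} + v_z \hat{z},\; w_x \hat{x} + w_y \hat{y} + w_z \hat{z}) \\
&= v_x T(\hat{x},\, w_x \hat{x} + w_y \hat{y} + w_z \hat{z}) + v_y T(\hat{y},\, w_x \hat{x} + w_y \hat{y} + w_z \hat{z}) \\
&\quad + v_z T(\hat{z},\, w_x \hat{x} + w_y \hat{y} + w_z \hat{z}) \\
&= v_x w_x T(\hat{x}, \hat{x}) + v_x w_y T(\hat{x}, \hat{y}) + v_x w_z T(\hat{x}, \hat{z}) + v_y w_x T(\hat{y}, \hat{x}) + v_y w_y T(\hat{y}, \hat{y}) \\
&\quad + v_y w_z T(\hat{y}, \hat{z}) + v_z w_x T(\hat{z}, \hat{x}) + v_z w_y T(\hat{z}, \hat{y}) + v_z w_z T(\hat{z}, \hat{z}).
\end{aligned}
$$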


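As in the previous example, we define the components of T as

$$
T_{xx} = T(\hat{x}, \hat{x}), \quad T_{xy} = T(\hat{x}, \hat{y}), \quad T_{yx} = T(\hat{y}, \hat{x}), \tag{1.7}
$$

and so on. This then gives


$$
\begin{aligned}
T(v, w) &= v_x w_x T_{xx} + v_x w_y T_{xy} + v_x w_z T_{xz} + v_y w_x T_{yx} + v_y w_y T_{yy} \\
&\quad + v_y w_z T_{yz} + v_z w_x T_{zx} + v_z w_y T_{zy} + v_z w_z T_{zz},
\end{aligned}
\tag{1.8}
$$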


which may look familiar from discussion of tensors in the physics literature. In that 
literature, the above equation is often part of the definition of a 2nd rank tensor; 
here, though, we see that its form is really just a consequence of multilinearity. 
Another advantage of our approach is that the components $\{T_{xx}, T_{xy}, T_{xz}, \ldots\}$ of
T have a meaning beyond that of just being coefficients that appear in expressions 
like (1.8); Eq. (1.7) again shows that components are the values of the tensor 
when evaluated on a given set of basis vectors. We re-emphasize this fact because 
it is crucial in getting a feel for tensors and what they mean. 
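To see (1.7) and (1.8) at work, the following plain-Python sketch starts from a bilinear function, reads off its components by feeding it basis vectors, and then rebuilds its value on arbitrary vectors from those components alone. The particular function `T` and the sample vectors are hypothetical, chosen only so there is something concrete to evaluate.

```python
def T(v, w):
    """A hypothetical bilinear function on R^3, chosen only for illustration."""
    return (2 * v[0] * w[0] - v[0] * w[1] + 3 * v[1] * w[2]
            + v[2] * w[0] + 4 * v[2] * w[2])

basis = [(1, 0, 0), (0, 1, 0), (0, 0, 1)]  # x-hat, y-hat, z-hat

# Components as in (1.7): T_ij = T(e_i, e_j).
components = [[T(ei, ej) for ej in basis] for ei in basis]

def T_from_components(v, w):
    """Right-hand side of (1.8): the sum over i, j of v_i w_j T_ij."""
    return sum(v[i] * w[j] * components[i][j]
               for i in range(3) for j in range(3))

v, w = (1, -2, 3), (4, 0, -1)
print(components)
print(T(v, w), T_from_components(v, w))
```

The nine numbers in `components` carry all the information in `T`; that is the content of the statement that a tensor is determined by its values on basis vectors.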

Another nice feature of our definition of a tensor is that it allows us to derive 
the tensor transformation laws which historically were taken as the definition of a 
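tensor. Say we switch to a new set of basis vectors $\{\hat{x}', \hat{y}', \hat{z}'\}$ which are related to
the old basis vectors by


$$
\begin{aligned}
\hat{x}' &= A_{x'x}\hat{x} + A_{x'y}\hat{y} + A_{x'z}\hat{z} \\
\hat{y}' &= A_{y'x}\hat{x} + A_{y'y}\hat{y} + A_{y'z}\hat{z} \\
\hat{z}' &= A_{z'x}\hat{x} + A_{z'y}\hat{y} + A_{z'z}\hat{z}.
\end{aligned}
\tag{1.9}
$$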


This does not affect the action of T, since T exists independently of any basis,
but if we’d like to compute the value of T(v, w) in terms of the new components
$(v_{x'}, v_{y'}, v_{z'})$ and $(w_{x'}, w_{y'}, w_{z'})$ of v and w, then we’ll need to know what the new
components $\{T_{x'x'}, T_{x'y'}, T_{x'z'}, \ldots\}$ look like. Computing $T_{x'x'}$, for instance, gives


$$
\begin{aligned}
T_{x'x'} &= T(\hat{x}', \hat{x}') \\
&= T(A_{x'x}\hat{x} + A_{x'y}\hat{y} + A_{x'z}\hat{z},\; A_{x'x}\hat{x} + A_{x'y}\hat{y} + A_{x'z}\hat{z}) \\
&= A_{x'x}A_{x'x}T_{xx} + A_{x'y}A_{x'x}T_{yx} + A_{x'z}A_{x'x}T_{zx} + A_{x'x}A_{x'y}T_{xy} \\
&\quad + A_{x'y}A_{x'y}T_{yy} + A_{x'z}A_{x'y}T_{zy} + A_{x'x}A_{x'z}T_{xz} + A_{x'y}A_{x'z}T_{yz} + A_{x'z}A_{x'z}T_{zz}.
\end{aligned}
\tag{1.10}
$$


You probably recognize this as an instance of the standard tensor transformation 
law, which used to be taken as the definition of a tensor. Here, the transformation 
law is another consequence of multilinearity. In Chap. 3 we will introduce more 
convenient notation which will allow us to use the Einstein summation convention, 
so that we can write the general form of (1.10) as 


$$
T_{i'j'} = A^{k}_{i'} A^{l}_{j'} T_{kl},
$$


a form which may be more familiar to some readers. 
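The transformation law can be checked the same way. In this plain-Python sketch the new components are computed twice: once by evaluating the tensor directly on the new basis vectors, and once by applying the double sum over the old components as in (1.10). The bilinear function `T` and the coefficient array `A` are hypothetical examples; `A` only needs to describe a valid (linearly independent) new basis.

```python
def T(v, w):
    """A hypothetical bilinear function on R^3, chosen only for illustration."""
    return (2 * v[0] * w[0] - v[0] * w[1] + 3 * v[1] * w[2]
            + v[2] * w[0] + 4 * v[2] * w[2])

old_basis = [(1, 0, 0), (0, 1, 0), (0, 0, 1)]
T_old = [[T(ei, ej) for ej in old_basis] for ei in old_basis]

# Coefficients as in (1.9): new_basis[i] = sum over k of A[i][k] * old_basis[k].
# This A has determinant 3, so the new vectors do form a basis.
A = [[1, 1, 0],
     [0, 1, 2],
     [1, 0, 1]]
new_basis = [tuple(sum(A[i][k] * old_basis[k][m] for k in range(3))
                   for m in range(3))
             for i in range(3)]

# Route 1: components are values of T on the new basis vectors.
T_new_direct = [[T(fi, fj) for fj in new_basis] for fi in new_basis]

# Route 2: the transformation law, T_{i'j'} = sum over k, l of A[i][k] A[j][l] T_kl.
T_new_law = [[sum(A[i][k] * A[j][l] * T_old[k][l]
                  for k in range(3) for l in range(3))
              for j in range(3)]
             for i in range(3)]

print(T_new_direct == T_new_law)
print(T_new_direct[0][0])
```

Route 2 never calls `T`; it only reshuffles the old components, which is why older treatments could take it as a definition.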

One common source of confusion is that in physics textbooks, tensors (usually 
of the 2nd rank) are often represented as matrices, so then the student wonders what 
the difference is between a matrix and a tensor. Above, we have defined a tensor as 
a multilinear function on a vector space. What does that have to do with a matrix? 
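Well, if we choose a basis $\{\hat{x}, \hat{y}, \hat{z}\}$, we can then write the corresponding components
$\{T_{xx}, T_{xy}, T_{xz}, \ldots\}$ in the form of a matrix [T] as


$$
[T] = \begin{pmatrix} T_{xx} & T_{xy} & T_{xz} \\ T_{yx} & T_{yy} & T_{yz} \\ T_{zx} & T_{zy} & T_{zz} \end{pmatrix}.
$$


Equation (1.8) can then be written compactly as


$$
T(v, w) = (v_x, v_y, v_z)
\begin{pmatrix} T_{xx} & T_{xy} & T_{xz} \\ T_{yx} & T_{yy} & T_{yz} \\ T_{zx} & T_{zy} & T_{zz} \end{pmatrix}
\begin{pmatrix} w_x \\ w_y \\ w_z \end{pmatrix}
\tag{1.11}
$$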


where the usual matrix multiplication is implied. Thus, once a basis is chosen, the 
action of T can be neatly expressed using the corresponding matrix [T]. It is crucial 
to keep in mind, though, that this association between a tensor and a matrix depends 
entirely on a choice of basis, and that [T] is useful mainly as a computational tool, 
not a conceptual handle. T is best thought of abstractly as a multilinear function, 
and [T] as its representation in a particular coordinate system. 

One possible objection to our approach is that matrices and tensors are often
thought of as linear operators which take vectors into vectors, as opposed to objects
which eat vectors and spit out numbers. It turns out, though, that for a 2nd rank
tensor these two notions are equivalent. Say we have a linear operator $R$; then we
can turn $R$ into a 2nd rank tensor $T_R$ by


$$
T_R(v, w) = v \cdot Rw, \tag{1.12}
$$


where $\cdot$ denotes the usual dot product of vectors. You can roughly interpret this
tensor as taking in v and w and giving back the “component of Rw lying along v”.
You can also easily check that $T_R$ is multilinear, i.e. that it satisfies (1.6). If we
compute the components of $T_R$ we find that, for instance,


$$
\begin{aligned}
(T_R)_{xx} &= T_R(\hat{x}, \hat{x}) \\
&= \hat{x} \cdot R\hat{x} \\
&= \hat{x} \cdot (R_{xx}\hat{x} + R_{yx}\hat{y} + R_{zx}\hat{z}) \\
&= R_{xx}
\end{aligned}
$$


so the components of the tensor $T_R$ are the same as the components of the linear
operator $R$! In components, the action of $T_R$ then looks like


$$
T_R(v, w) = (v_x, v_y, v_z)
\begin{pmatrix} R_{xx} & R_{xy} & R_{xz} \\ R_{yx} & R_{yy} & R_{yz} \\ R_{zx} & R_{zy} & R_{zz} \end{pmatrix}
\begin{pmatrix} w_x \\ w_y \\ w_z \end{pmatrix},
$$


which is identical to (1.11). This makes it obvious how to turn a linear operator $R$
into a 2nd rank tensor: just sandwich the component matrix in between two vectors!
This whole process is also reversible: we can turn a 2nd rank tensor $T$ into a linear
operator $R_T$ by defining the components of the vector $R_T(v)$ as


$$
(R_T(v))_x = T(\hat{x}, v), \tag{1.13}
$$


and similarly for the other components. One can check (see Exercise 1.2 below) 
that these processes are inverses of each other, so this sets up a one-to-one 
correspondence between linear operators and 2nd rank tensors and we can thus 
regard them as equivalent. Since the matrix form of both is identical, one often 
does not bother trying to clarify exactly which one is at hand, and often times it is 
just a matter of interpretation. 
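The round trip between operators and rank-two tensors can be tried out on a concrete case. This plain-Python sketch builds $T_R$ from an operator via (1.12), confirms that its components reproduce the matrix of $R$, and then rebuilds the operator from the tensor via (1.13). The matrix `R`, the test vector, and the helper names are assumptions made for illustration; checking one example is a sanity test, not a substitute for the general argument.

```python
def dot(v, w):
    """Dot product of two 3-vectors."""
    return sum(a * b for a, b in zip(v, w))

def apply(R, v):
    """Apply the operator with matrix R to the vector v."""
    return tuple(sum(R[i][j] * v[j] for j in range(3)) for i in range(3))

basis = [(1, 0, 0), (0, 1, 0), (0, 0, 1)]  # x-hat, y-hat, z-hat

# A hypothetical linear operator, given by its matrix in this basis.
R = [[1, 2, 0],
     [0, -1, 3],
     [4, 0, 5]]

def T_R(v, w):
    """Rank-two tensor built from R as in (1.12): T_R(v, w) = v . Rw."""
    return dot(v, apply(R, w))

# Components of T_R, obtained by evaluating it on basis vectors.
components_T_R = [[T_R(ei, ej) for ej in basis] for ei in basis]

def operator_from_tensor(T, v):
    """Operator built from a tensor as in (1.13): i-th component is T(e_i, v)."""
    return tuple(T(ei, v) for ei in basis)

v = (1, -2, 3)
print(components_T_R == R)
print(operator_from_tensor(T_R, v) == apply(R, v))
```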

How does all this work in a physical context? One nice example is the rank
2 moment-of-inertia tensor $\mathcal{I}$, familiar from rigid body dynamics in classical
mechanics. In most textbooks, this tensor usually arises when one considers the
relationship between the angular velocity vector $\omega$ and the angular momentum $L$
or kinetic energy $KE$ of a rigid body. If one chooses a basis $\{\hat{x}, \hat{y}, \hat{z}\}$ and then
expresses, say, $KE$ in terms of $\omega$, one gets an expression like (1.8) with $v = w = \omega$
and $T_{ij} = \mathcal{I}_{ij}$. From this expression, most textbooks then figure out how the $\mathcal{I}_{ij}$
must transform under a change of coordinates, and this behavior under coordinate
change is then what identifies the $\mathcal{I}_{ij}$ as a tensor. In this text, though, we will take a
different point of view. We will define $\mathcal{I}$ to be the function which, for a given rigid
body, assigns to a state of rotation (described by the angular velocity vector $\omega$)
twice the corresponding kinetic energy $KE$. One can then calculate $KE$ in terms of
$\omega$, which yields an expression of the form (1.8); this then shows that $\mathcal{I}$ is a tensor in
the old sense of the term, which also means that it’s a tensor in the modern sense of
being a multilinear function. We can then think of $\mathcal{I}$ as a 2nd rank tensor defined by


$$
KE = \frac{1}{2}\,\mathcal{I}(\omega, \omega).
$$


From our point of view, we see that the components of $\mathcal{I}$ have a physical meaning;
for instance, $\mathcal{I}_{xx} = \mathcal{I}(\hat{x}, \hat{x})$ is just twice the kinetic energy of the rigid body when
$\omega = \hat{x}$. When one actually needs to compute the kinetic energy for a rigid body,
one usually picks a convenient coordinate system, computes the components of $\mathcal{I}$
using the standard formulae,² and then uses the convenient matrix representation to
compute


$$
KE = \frac{1}{2}\,(\omega_x, \omega_y, \omega_z)
\begin{pmatrix} \mathcal{I}_{xx} & \mathcal{I}_{xy} & \mathcal{I}_{xz} \\ \mathcal{I}_{yx} & \mathcal{I}_{yy} & \mathcal{I}_{yz} \\ \mathcal{I}_{zx} & \mathcal{I}_{zy} & \mathcal{I}_{zz} \end{pmatrix}
\begin{pmatrix} \omega_x \\ \omega_y \\ \omega_z \end{pmatrix},
$$


which is the familiar expression often found in mechanics texts.
As mentioned above, one can also interpret $\mathcal{I}$ as a linear operator which takes
vectors into vectors. If we let $\mathcal{I}$ act on the angular velocity vector, we get


$$
\begin{pmatrix} \mathcal{I}_{xx} & \mathcal{I}_{xy} & \mathcal{I}_{xz} \\ \mathcal{I}_{yx} & \mathcal{I}_{yy} & \mathcal{I}_{yz} \\ \mathcal{I}_{zx} & \mathcal{I}_{zy} & \mathcal{I}_{zz} \end{pmatrix}
\begin{pmatrix} \omega_x \\ \omega_y \\ \omega_z \end{pmatrix},
$$


which you probably recognize as the coordinate expression for the angular momentum
$L$. Thus, $\mathcal{I}$ can be interpreted either as the rank 2 tensor which eats two copies
of the angular velocity vector $\omega$ and produces twice the kinetic energy, or as the linear
operator which takes $\omega$ into $L$. The two definitions are equivalent.
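Both readings of the moment-of-inertia tensor can be compared numerically for a toy rigid body. The sketch below, in plain Python, assumes a body made of a few point masses and uses the standard point-mass formula $\mathcal{I}_{ij} = \sum m\,(|r|^2 \delta_{ij} - r_i r_j)$ for the components; neither the formula nor the sample masses, positions, and angular velocity appear in this chapter, so treat them as assumptions of the illustration. The kinetic energy and angular momentum are also computed directly from the particle velocities $\omega \times r$ for comparison.

```python
from fractions import Fraction

def dot(u, v):
    """Dot product of two 3-vectors."""
    return sum(a * b for a, b in zip(u, v))

def cross(u, v):
    """Cross product of two 3-vectors."""
    return (u[1] * v[2] - u[2] * v[1],
            u[2] * v[0] - u[0] * v[2],
            u[0] * v[1] - u[1] * v[0])

# Hypothetical rigid body: (mass, position) pairs in arbitrary units.
body = [(2, (1, 0, 0)), (1, (0, 2, 1)), (3, (-1, 1, 2))]

# Components from the standard point-mass formula (an assumption of this sketch).
I = [[sum(m * (dot(r, r) * (i == j) - r[i] * r[j]) for m, r in body)
      for j in range(3)]
     for i in range(3)]

def inertia(v, w):
    """The tensor as a bilinear function: sum over i, j of v_i I_ij w_j."""
    return sum(v[i] * I[i][j] * w[j] for i in range(3) for j in range(3))

omega = (1, 2, -1)  # sample angular velocity
half = Fraction(1, 2)

# Tensor reading: KE = (1/2) I(omega, omega), versus the sum of (1/2) m |omega x r|^2.
ke_tensor = half * inertia(omega, omega)
ke_direct = sum(half * m * dot(cross(omega, r), cross(omega, r)) for m, r in body)

# Operator reading: L = I omega, versus the sum of m r x (omega x r).
L_operator = tuple(sum(I[i][j] * omega[j] for j in range(3)) for i in range(3))
L_direct = tuple(sum(m * cross(r, cross(omega, r))[i] for m, r in body)
                 for i in range(3))

print(ke_tensor == ke_direct)
print(L_operator == L_direct)
```

The same array `I` serves both purposes, which is the equivalence of the two interpretations in concrete form.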

Many other tensors which we’ll consider in this text are also nicely understood 
as multilinear functions which happen to have convenient component and matrix 
representations. These include the electromagnetic field tensor of electrodynamics, 
as well as the metric tensors of Newtonian and relativistic mechanics. As we 
progress we’ll see how we can use our new point of view to get a handle on these 
usually somewhat enigmatic objects. 


Exercise 1.2. Verify that the definitions in (1.12) and (1.13) invert each other. Do this by
considering the tensor $T_R$ corresponding to a linear operator $R$, and then the linear operator
$R_{T_R}$ corresponding to $T_R$. Show that $R_{T_R} = R$ by feeding in a vector v on both sides and
taking a particular (say, x) component of the result.


²See Example 3.14 or any standard textbook such as Goldstein [8].


Chapter 2 
Vector Spaces 


Since tensors are a special class of functions defined on vector spaces, we must have 
a good foundation in linear algebra before discussing them. In particular, you’ll
need a little bit more linear algebra than is covered in most sophomore or junior 
level linear algebra/ODE courses. This chapter starts with the familiar material 
about vectors, bases, linear operators, etc., but eventually moves on to slightly 
more sophisticated topics such as dual vectors and non-degenerate Hermitian forms, 
which are essential for understanding tensors in physics. Along the way we’ll also 
find that our slightly more abstract viewpoint clarifies the nature of many familiar 
but enigmatic objects, such as spherical harmonics, bras and kets, contravariant and 
covariant indices, and the Dirac delta function. 


2.1 Definition and Examples 


We begin with the definition of an abstract vector space. We’re taught as undergrad- 
uates to think of vectors as arrows with a head and a tail, or as ordered triples of real 
numbers. However, physics (and especially quantum mechanics) requires a more 
abstract notion of vectors. Before reading the definition of an abstract vector space, 
keep in mind that the definition is supposed to distill all the essential features of 
vectors as we know them (like addition and scalar multiplication) while detaching 
the notion of a vector space from specific constructs, like ordered n-tuples of real 
or complex numbers (denoted as $\mathbb{R}^n$ and $\mathbb{C}^n$ respectively). The mathematical utility
of this is that much of what we know about vector spaces depends only on the
essential properties of addition and scalar multiplication, not on other properties
particular to $\mathbb{R}^n$ or $\mathbb{C}^n$. If we work in the abstract framework and then come across
other mathematical objects that don’t look like $\mathbb{R}^n$ or $\mathbb{C}^n$ but that are abstract
vector spaces, then most everything we know about $\mathbb{R}^n$ and $\mathbb{C}^n$ will apply to these
spaces as well. Physics also forces us to use the abstract definition since many
quantum-mechanical vector spaces are infinite-dimensional and cannot be viewed
as $\mathbb{C}^n$ or $\mathbb{R}^n$ for any $n$. An added dividend of the abstract approach is that we will
learn to think about vector spaces independently of any basis, which will prove very 
useful. 

That said, an (abstract) vector space consists of the following: 


• A set $V$ (whose elements are called vectors)

• A set of scalars $C$ (which for us will always be either $\mathbb{R}$ or $\mathbb{C}$)

• Operations of addition and scalar multiplication under which the vector space is
closed,¹ and which satisfy the following axioms:

1. $v + w = w + v$ for all $v, w$ in $V$ (Commutativity)
2. $v + (w + x) = (v + w) + x$ for all $v, w, x$ in $V$ (Associativity)
3. There exists a vector $0$ in $V$ such that $v + 0 = v$ for all $v$ in $V$
4. For all $v$ in $V$ there is a vector $-v$ such that $v + (-v) = 0$
5. $c(v + w) = cv + cw$ for all $v$ and $w$ in $V$ and scalars $c$ (Distributivity)
6. $1v = v$ for all $v$ in $V$
7. $(c_1 + c_2)v = c_1 v + c_2 v$ for all scalars $c_1, c_2$ and vectors $v$
8. $(c_1 c_2)v = c_1(c_2 v)$ for all scalars $c_1, c_2$ and vectors $v$


Some parts of the definition may seem tedious or trivial, but they are just meant 
to ensure that the addition and scalar multiplication operations behave the way we 
expect them to. In determining whether a set is a vector space or not, one is usually 
most concerned with defining addition in such a way that the set is closed under 
addition and that axioms 3 and 4 are satisfied; most of the other axioms are so natural 
and obviously satisfied that one, in practice, rarely bothers to check them.² That said,
let’s look at some examples from physics, most of which will recur throughout the 
text. 


Example 2.1. $\mathbb{R}^n$


This is the most basic example of a vector space, and the one on which the abstract
definition is modeled. Addition and scalar multiplication are defined in the usual
way: for $v = (v^1, v^2, \ldots, v^n)$, $w = (w^1, w^2, \ldots, w^n)$ in $\mathbb{R}^n$, we have

$$(v^1, \ldots, v^n) + (w^1, \ldots, w^n) = (v^1 + w^1, v^2 + w^2, \ldots, v^n + w^n), \tag{2.1}$$


¹ Meaning that these operations always produce another member of the set $V$, i.e. a vector.


² Another word about axioms 3 and 4, for the mathematically inclined (feel free to skip this if you
like): the axioms don’t demand that the zero element and inverses are unique, but this actually
follows easily from the axioms. If $0$ and $0'$ are two zero elements, then

$$0 = 0 + 0' = 0',$$

and so the zero element is unique. Similarly, if $-v$ and $-v'$ are both inverse to some vector $v$, then

$$-v = -v + 0 = -v + (v - v') = (-v + v) - v' = 0 - v' = -v',$$

and so inverses are unique as well.


and

$$c(v^1, v^2, \ldots, v^n) = (cv^1, cv^2, \ldots, cv^n), \tag{2.2}$$

and you should check that the axioms are satisfied. These spaces, of course, are
basic in physics; $\mathbb{R}^3$ is our usual three-dimensional cartesian space, $\mathbb{R}^4$ is spacetime
in special relativity, and $\mathbb{R}^n$ for higher $n$ occurs in classical physics as configuration
spaces for multiparticle systems (e.g., $\mathbb{R}^6$ is the configuration space in the classic
two-body problem, as you need six coordinates to specify the position of two
particles in three-dimensional space).
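
As a quick sanity check on Example 2.1, the following Python sketch models vectors in R^3 as tuples of exact rationals and spot-checks the eight axioms on a few sample vectors. The helper names `add` and `scale` are illustrative choices rather than notation from the text, and a spot check on samples is of course not a proof.

```python
from fractions import Fraction

def add(v, w):
    """Component-wise addition, as in (2.1)."""
    assert len(v) == len(w)
    return tuple(a + b for a, b in zip(v, w))

def scale(c, v):
    """Scalar multiplication, as in (2.2)."""
    return tuple(c * a for a in v)

# Sample vectors and scalars (arbitrary illustrative values).
v = (Fraction(1), Fraction(2), Fraction(3))
w = (Fraction(-1), Fraction(1, 2), Fraction(0))
x = (Fraction(4), Fraction(0), Fraction(-2))
zero = (Fraction(0),) * 3
c1, c2 = Fraction(2), Fraction(-3, 4)

# One entry per axiom, in the order listed in the definition.
checks = [
    add(v, w) == add(w, v),
    add(v, add(w, x)) == add(add(v, w), x),
    add(v, zero) == v,
    add(v, scale(-1, v)) == zero,
    scale(c1, add(v, w)) == add(scale(c1, v), scale(c1, w)),
    scale(1, v) == v,
    scale(c1 + c2, v) == add(scale(c1, v), scale(c2, v)),
    scale(c1 * c2, v) == scale(c1, scale(c2, v)),
]
print(all(checks))
```

Exact rationals are used instead of floats so that the equalities hold exactly and are not spoiled by rounding.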


Example 2.2. $\mathbb{C}^n$


This is another basic example: addition and scalar multiplication are defined
by (2.1) and (2.2), just as for $\mathbb{R}^n$, and the axioms are again straightforward to verify.
Note, however, that we can take $\mathbb{C}^n$ to be a complex vector space (i.e., we may take
the set $C$ in the definition to be $\mathbb{C}$), since the right-hand side of (2.2) is guaranteed
to be in $\mathbb{C}^n$ even when $c$ is complex. The same is not true for $\mathbb{R}^n$, which is why $\mathbb{R}^n$ is
only a real vector space. This seemingly pedantic distinction can often end up being
significant, as we’ll see.

As for physical applications, $\mathbb{C}^n$ occurs in physics primarily as the ket space
for finite-dimensional quantum-mechanical systems, such as particles with spin but
without translational degrees of freedom. For instance, a spin 1/2 particle fixed in
space has ket space identifiable with $\mathbb{C}^2$, and a more general fixed particle with spin
$s$ has ket space identifiable with $\mathbb{C}^{2s+1}$.


Box 2.1 $\mathbb{C}^n$ as a Real Vector Space

Note that for $\mathbb{C}^n$ we were not forced to take $C = \mathbb{C}$; in fact, we could have
taken $C = \mathbb{R}$ since (2.2) certainly makes sense when $c \in \mathbb{R}$. In this case we
would write the vector space as $\mathbb{C}^n_{\mathbb{R}}$, to remind us that we are considering $\mathbb{C}^n$
as a real vector space. It may not be obvious what the difference between $\mathbb{C}^n$
and $\mathbb{C}^n_{\mathbb{R}}$ is, since both consist of $n$-tuples of complex numbers, nor may it be
obvious why anyone would take $C = \mathbb{R}$ when one could take $C = \mathbb{C}$. We will
answer both these questions when we consider bases and dimension in the next
section. As a matter of course, though, you should assume that we take $C = \mathbb{C}$
if possible, unless explicitly stated otherwise.


Example 2.3. $M_n(\mathbb{R})$ and $M_n(\mathbb{C})$, $n \times n$ matrices with real or complex entries


The vector space structure of $M_n(\mathbb{R})$ and $M_n(\mathbb{C})$ is similar to that of $\mathbb{R}^n$ and $\mathbb{C}^n$:
denoting the entry in the $i$th row and $j$th column of a matrix $A$ as $A_{ij}$, we define
addition and (real) scalar multiplication for $A, B \in M_n(\mathbb{R})$ by

$$(A + B)_{ij} = A_{ij} + B_{ij}$$
$$(cA)_{ij} = cA_{ij},$$

i.e. addition and scalar multiplication are done component-wise. The same definitions
are used for $M_n(\mathbb{C})$, which can of course be taken to be a complex vector space.
You can again check that the axioms are satisfied. Though these vector spaces don’t
appear explicitly in physics very often, they have many important subspaces, one of
which we consider in the next example.


Example 2.4. $H_n(\mathbb{C})$, $n \times n$ Hermitian matrices with complex entries


$H_n(\mathbb{C})$, the set of all $n \times n$ Hermitian matrices,³ is obviously a subset of $M_n(\mathbb{C})$,
and in fact it is a subspace of $M_n(\mathbb{C})$ in that it forms a vector space itself.⁴ To
show this it is not necessary to verify all of the axioms, since most of them are
satisfied by virtue of $H_n(\mathbb{C})$ being a subset of $M_n(\mathbb{C})$; for instance, addition and
scalar multiplication in $H_n(\mathbb{C})$ are just given by the restriction of those operations
in $M_n(\mathbb{C})$ to $H_n(\mathbb{C})$, so the commutativity of addition and the distributivity of scalar
multiplication over addition follow immediately. What does remain to be checked
is that $H_n(\mathbb{C})$ is closed under addition and contains the zero “vector” (in this case,
the zero matrix), both of which are easily verified.

As far as physical applications go, we know that physical observables in quantum
mechanics are represented by Hermitian operators, and if we are dealing with
a finite-dimensional ket space such as those mentioned in Example 2.2 then
observables can be represented as elements of $H_n(\mathbb{C})$. As an example one can take
a fixed spin 1/2 particle whose ket space is $\mathbb{C}^2$; the angular momentum operators are
then represented as $L_i = \frac{1}{2}\sigma_i$, where the $\sigma_i$ are the Hermitian Pauli matrices

$$\sigma_x \equiv \begin{pmatrix} 0 & 1 \\ 1 & 0 \end{pmatrix}, \quad \sigma_y \equiv \begin{pmatrix} 0 & -i \\ i & 0 \end{pmatrix}, \quad \sigma_z \equiv \begin{pmatrix} 1 & 0 \\ 0 & -1 \end{pmatrix}. \tag{2.3}$$



Box 2.2 $H_n(\mathbb{C})$ is not a Complex Vector Space

One interesting thing about $H_n(\mathbb{C})$ is that even though the entries of its matrices
can be complex, it does not form a complex vector space; multiplying a
Hermitian matrix by $i$ yields an anti-Hermitian matrix, as you can check, so
$H_n(\mathbb{C})$ is not closed under complex scalar multiplication. Thus $H_n(\mathbb{C})$ is only
a real vector space, even though its matrices contain complex numbers. This
point is subtle but worth understanding!
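
The closure failure described in this box can be checked numerically. The sketch below, in plain Python with matrices stored as nested lists, tests the Pauli matrices of (2.3); the helper names `dagger`, `scale`, `is_hermitian` and `is_anti_hermitian` are illustrative and not part of the text.

```python
def dagger(A):
    """Conjugate transpose of a matrix stored as a list of rows."""
    rows, cols = len(A), len(A[0])
    return [[A[r][c].conjugate() for r in range(rows)] for c in range(cols)]

def scale(c, A):
    """Multiply every entry of A by the scalar c."""
    return [[c * a for a in row] for row in A]

def is_hermitian(A):
    return dagger(A) == A

def is_anti_hermitian(A):
    return dagger(A) == scale(-1, A)

sigma_x = [[0, 1], [1, 0]]
sigma_y = [[0, -1j], [1j, 0]]
sigma_z = [[1, 0], [0, -1]]

for name, s in [("sigma_x", sigma_x), ("sigma_y", sigma_y), ("sigma_z", sigma_z)]:
    # Real multiples stay Hermitian; multiplying by i does not.
    print(name,
          is_hermitian(s),
          is_hermitian(scale(2.5, s)),
          is_hermitian(scale(1j, s)),
          is_anti_hermitian(scale(1j, s)))
```

Each line should report that the matrix and its real multiple are Hermitian, while its product with i is anti-Hermitian instead.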




Example 2.5. $L^2([a,b])$, Square-integrable complex-valued functions on an
interval


³ Hermitian matrices being those which satisfy $A^\dagger \equiv (A^T)^* = A$, where superscript $T$ denotes the
transpose and superscript $*$ denotes complex conjugation of the entries.


⁴ Another footnote for the mathematically inclined: as discussed later in this example, though,
$H_n(\mathbb{C})$ is only a real vector space, so it is only a subspace of $M_n(\mathbb{C})$ when $M_n(\mathbb{C})$ is considered as
a real vector space.


This example is fundamental in quantum mechanics. A complex-valued function $f$
on $[a, b] \subset \mathbb{R}$ is said to be square-integrable if

$$\int_a^b |f(x)|^2 \, dx < \infty. \tag{2.4}$$

Defining addition and scalar multiplication in the obvious way,

$$(f + g)(x) = f(x) + g(x)$$
$$(cf)(x) = c f(x),$$

and taking the zero element to be the function which is identically zero (i.e., 
f(x) = 0 for all x) yields a complex vector space. (Note that if we considered only 
real-valued functions then we would only have a real vector space.) Verifying the 
axioms is straightforward though not entirely trivial, as one must show that the sum 
of two square-integrable functions is again square-integrable (Problem 2-2). This 
vector space arises in quantum mechanics as the set of normalizable wavefunctions 
for a particle in a one-dimensional infinite potential well. Later on we’ll consider the
more general scenario where the particle may be unbound, in which case $a = -\infty$
and $b = \infty$ and the above definitions are otherwise unchanged. This vector space is
denoted as $L^2(\mathbb{R})$.


Example 2.6. $\mathcal{H}_l(\mathbb{R}^3)$ and $\tilde{\mathcal{H}}_l$, The Harmonic Polynomials and the Spherical
Harmonics


Consider the set $P_l(\mathbb{R}^3)$ of all complex-coefficient polynomial functions on $\mathbb{R}^3$
of fixed degree $l$, i.e. all linear combinations of functions of the form $x^i y^j z^k$
where $i + j + k = l$. Addition and (complex) scalar multiplication are defined
in the usual way and the axioms are again easily verified, so $P_l(\mathbb{R}^3)$ is a vector
space. Now consider the vector subspace $\mathcal{H}_l(\mathbb{R}^3) \subset P_l(\mathbb{R}^3)$ of harmonic degree
$l$ polynomials, i.e. degree $l$ polynomials satisfying $\Delta f = 0$, where $\Delta$ is the
usual three-dimensional Laplacian. You may be surprised to learn that the spherical
harmonics of degree $l$ are essentially elements of $\mathcal{H}_l(\mathbb{R}^3)$! To see the connection,
note that if we write a degree $l$ polynomial

$$f(x, y, z) = \sum_{\substack{i,j,k \\ i+j+k=l}} c_{ijk} \, x^i y^j z^k$$

in spherical coordinates with polar angle $\theta$ and azimuthal angle $\phi$, we’ll get

$$f(r, \theta, \phi) = r^l \, Y(\theta, \phi)$$

for some function $Y(\theta, \phi)$, which is just the restriction of $f$ to the unit sphere. If we
write the Laplacian out in spherical coordinates, we get

$$\Delta = \frac{\partial^2}{\partial r^2} + \frac{2}{r}\frac{\partial}{\partial r} + \frac{1}{r^2}\Delta_{S^2}, \tag{2.5}$$

where $\Delta_{S^2}$ is shorthand for the angular part of the Laplacian.⁵ You will show below
that applying this to $f(r, \theta, \phi) = r^l \, Y(\theta, \phi)$ and demanding that $f$ be harmonic
yields

$$\Delta_{S^2} Y(\theta, \phi) = -l(l + 1) \, Y(\theta, \phi), \tag{2.7}$$

which is the definition of a spherical harmonic of degree $l$! Conversely, if we
take a degree $l$ spherical harmonic and multiply it by $r^l$, the result is a harmonic
function. If we let $\tilde{\mathcal{H}}_l$ denote the set of all spherical harmonics of degree $l$, then
we have established a one-to-one correspondence between $\tilde{\mathcal{H}}_l$ and $\mathcal{H}_l(\mathbb{R}^3)$! The
correspondence is given by restricting functions in $\mathcal{H}_l(\mathbb{R}^3)$ to the unit sphere, and
conversely by multiplying functions in $\tilde{\mathcal{H}}_l$ by $r^l$, i.e.

$$\begin{aligned}
\mathcal{H}_l(\mathbb{R}^3) &\longleftrightarrow \tilde{\mathcal{H}}_l \\
f &\longmapsto f(r = 1, \theta, \phi) \\
r^l \, Y(\theta, \phi) &\longleftarrow Y(\theta, \phi).
\end{aligned}$$

In particular, this means that: 


The familiar spherical harmonics $Y^l_m(\theta, \phi)$ are just the restriction of
particular harmonic degree $l$ polynomials to the unit sphere.


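For instance, consider the case $l = 1$. Clearly $\mathcal{H}_1(\mathbb{R}^3) = P_1(\mathbb{R}^3)$ since all first-degree
(linear) functions are harmonic. If we write the functions

$$\frac{x + iy}{\sqrt{2}}, \quad z, \quad -\frac{x - iy}{\sqrt{2}} \in \mathcal{H}_1(\mathbb{R}^3)$$

in spherical coordinates, we get

$$\frac{1}{\sqrt{2}} \, r e^{i\phi} \sin\theta, \quad r\cos\theta, \quad -\frac{1}{\sqrt{2}} \, r e^{-i\phi} \sin\theta \in \mathcal{H}_1(\mathbb{R}^3).$$
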

⁵ The differential operator $\Delta_{S^2}$ is also sometimes known as the spherical Laplacian, and is given
explicitly by

$$\Delta_{S^2} = \frac{\partial^2}{\partial\theta^2} + \cot\theta \, \frac{\partial}{\partial\theta} + \frac{1}{\sin^2\theta} \frac{\partial^2}{\partial\phi^2}. \tag{2.6}$$

We won’t need the explicit form of $\Delta_{S^2}$ here. A derivation and further discussion can be found in
any electrodynamics or quantum mechanics book, like Sakurai [17].


Restricting these to the unit sphere yields

$$\frac{1}{\sqrt{2}} \, e^{i\phi} \sin\theta, \quad \cos\theta, \quad -\frac{1}{\sqrt{2}} \, e^{-i\phi} \sin\theta \in \tilde{\mathcal{H}}_1.$$

Up to (overall) normalization, these are the usual degree 1 spherical harmonics $Y^1_m$,
$-1 \le m \le 1$. The $l = 2$ case is treated in Exercise 2.2 below. Spherical harmonics
are discussed further throughout this text; for a complete discussion, see Sternberg [19].

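
Harmonicity of a polynomial is easy to test mechanically. The sketch below stores a polynomial in x, y, z as a dictionary mapping exponent triples (i, j, k) to coefficients and applies the Laplacian term by term. The representation, the function names, and the sample polynomials are all illustrative assumptions; the samples are not meant to reproduce any particular normalization used in the text.

```python
def laplacian(poly):
    """Apply the 3D Laplacian to a polynomial {(i, j, k): coeff} in x, y, z."""
    out = {}
    for powers, coeff in poly.items():
        for axis in range(3):
            p = powers[axis]
            if p >= 2:
                # Second derivative of t^p is p*(p-1)*t^(p-2).
                new = list(powers)
                new[axis] = p - 2
                key = tuple(new)
                out[key] = out.get(key, 0) + coeff * p * (p - 1)
    # Drop terms that cancelled exactly.
    return {k: c for k, c in out.items() if c != 0}

def is_harmonic(poly):
    return laplacian(poly) == {}

# (x + iy)^2 = x^2 + 2i xy - y^2
x_plus_iy_squared = {(2, 0, 0): 1, (1, 1, 0): 2j, (0, 2, 0): -1}
# z (x + iy)
z_times_x_plus_iy = {(1, 0, 1): 1, (0, 1, 1): 1j}
# 2z^2 - x^2 - y^2
two_z2_minus_rest = {(0, 0, 2): 2, (2, 0, 0): -1, (0, 2, 0): -1}
# r^2 = x^2 + y^2 + z^2, which is not harmonic
r_squared = {(2, 0, 0): 1, (0, 2, 0): 1, (0, 0, 2): 1}

print(is_harmonic(x_plus_iy_squared))   # True
print(is_harmonic(z_times_x_plus_iy))   # True
print(is_harmonic(two_z2_minus_rest))   # True
print(laplacian(r_squared))             # {(0, 0, 0): 6}
```

The last line illustrates why not every degree 2 polynomial lies in the harmonic subspace: the Laplacian of r^2 is the nonzero constant 6.
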
Exercise 2.1. Verify (2.7). 


Exercise 2.2. Consider the functions 


$(x + iy)’, —z(x + iy), 02 x? — y?), —2(x — iy), $(x — iy)? € P2(R’). (2.8) 


Verify that they are in fact harmonic, and then write them in spherical coordinates and
restrict to the unit sphere to obtain, up to normalization, the familiar degree 2 spherical
harmonics $Y^2_m$, $-2 \le m \le 2$.


Non-example $GL(n, \mathbb{R})$, invertible $n \times n$ matrices

The “general linear group” $GL(n, \mathbb{R})$, defined to be the subset of $M_n(\mathbb{R})$ consisting
of invertible $n \times n$ matrices, is not a vector space though it seems like it could be.
Why not?


2.2 Span, Linear Independence, and Bases 


The notion of a basis is probably familiar to most readers, at least intuitively: it’s 
a set of vectors out of which we can “make” all the other vectors in a given vector 
space V. In this section we’ll make this idea precise and describe bases for some of 
the examples in the previous section. 

First, we need the notion of the span of a set of vectors. If
$S = \{v_1, v_2, \ldots, v_k\} \subset V$ is a set of $k$ vectors in $V$, then the span of $S$, denoted
Span $\{v_1, v_2, \ldots, v_k\}$ or Span $S$, is defined to be just the set of all vectors of the
form

$$c^1 v_1 + c^2 v_2 + \cdots + c^k v_k, \quad c^i \in C.$$

Such vectors are known as linear combinations of the $v_i$, so Span $S$ is just
the set of all linear combinations of the vectors in $S$. For instance, if $S =
\{(1,0,0), (0,1,0)\} \subset \mathbb{R}^3$, then Span $S$ is just the set of all vectors of the form
$(c^1, c^2, 0)$ with $c^1, c^2 \in \mathbb{R}$. If $S$ has infinitely many elements, then the span of $S$
is again all the linear combinations of vectors in $S$, though in this case the linear
combinations can have an arbitrarily large (but finite) number of terms.⁶

Next we need the notion of linear dependence: a (not necessarily finite) set
of vectors $S$ is said to be linearly dependent if there exist distinct vectors
$v_1, v_2, \ldots, v_m$ in $S$ and scalars $c^1, c^2, \ldots, c^m$, not all of which are 0, such that

$$c^1 v_1 + c^2 v_2 + \cdots + c^m v_m = 0. \tag{2.9}$$

What this definition really means is that at least one vector in $S$ can be written as
a linear combination of the others, and in that sense is dependent (you should take
a second to convince yourself of this). If $S$ is not linearly dependent, then we say
it is linearly independent, and in this case no vector in $S$ can be written as a linear
combination of any others. For instance, the set $S = \{(1,0,0), (0,1,0), (1,1,0)\} \subset
\mathbb{R}^3$ is linearly dependent, whereas the set $S' = \{(1,0,0), (0,1,0), (0,1,1)\}$ is
linearly independent, as you should check.
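
For finite sets of vectors in R^n, one way to carry out this check is to row-reduce the vectors and count pivots: the set is linearly independent exactly when the rank equals the number of vectors. The following Python sketch does this with exact rational arithmetic; the function names `rank` and `is_linearly_independent` are illustrative.

```python
from fractions import Fraction

def rank(vectors):
    """Rank of a list of vectors via Gaussian elimination over the rationals."""
    rows = [[Fraction(a) for a in v] for v in vectors]
    ncols = len(rows[0]) if rows else 0
    r = 0  # number of pivots found so far
    for col in range(ncols):
        pivot = next((i for i in range(r, len(rows)) if rows[i][col] != 0), None)
        if pivot is None:
            continue
        rows[r], rows[pivot] = rows[pivot], rows[r]
        for i in range(r + 1, len(rows)):
            factor = rows[i][col] / rows[r][col]
            rows[i] = [a - factor * b for a, b in zip(rows[i], rows[r])]
        r += 1
    return r

def is_linearly_independent(vectors):
    return rank(vectors) == len(vectors)

S = [(1, 0, 0), (0, 1, 0), (1, 1, 0)]
S_prime = [(1, 0, 0), (0, 1, 0), (0, 1, 1)]
print(is_linearly_independent(S))        # False
print(is_linearly_independent(S_prime))  # True
```

For S the elimination leaves a zero row, reflecting the relation (1,1,0) = (1,0,0) + (0,1,0).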

With these definitions in place we can now define a basis for a vector space $V$
as an ordered linearly independent set $\mathcal{B} \subset V$ whose span is all of $V$. This means,
roughly speaking, that a basis has enough vectors to “make” all of $V$, but no more
than that. When we say that $\mathcal{B} = \{v_1, \ldots, v_k\}$ is an ordered set we mean that the
order of the $v_i$ is part of the definition of $\mathcal{B}$, so another basis with the same vectors
but a different order is considered distinct. The reasons for this will become clear as
we progress.

One can show⁷ that all finite bases must have the same number of elements, so
we define the dimension of a vector space $V$, denoted $\dim V$, to be the number of
elements of any finite basis. If no finite basis exists, then we say that $V$ is infinite-dimensional.

Basis vectors are often denoted $e_i$, rather than $v_i$, and we will use this notation
from now on.


Exercise 2.3. Given a vector $v$ and a finite basis $\mathcal{B} = \{e_i\}_{i=1,\ldots,n}$, show that the expression
of $v$ as a linear combination of the $e_i$ is unique.


Example 2.7. The complex plane C as both a real and complex vector space 


We begin with a trivial example, as it illustrates the role of the scalars $C$. Consider
the vector spaces $\mathbb{C}^n$ and $\mathbb{C}^n_{\mathbb{R}}$ from Example 2.2 and Box 2.1, where the first space
is a complex vector space but the second is only real. Now let $n = 1$. Then the first
space is just $\mathbb{C}$, and the number 1 (or any other nonzero complex number, for that
matter) spans the space since we can write any $z \in \mathbb{C}$ as $z = z \cdot 1 \in \text{Span}\{1\}$. Since
a set with just one nonzero vector is trivially linearly independent (check!), 1 is a
basis for $\mathbb{C}$ and $\mathbb{C}$ is one-dimensional.


⁶ We don’t generally consider infinite linear combinations like $\sum_{i=1}^{\infty} c^i v_i = \lim_{N \to \infty} \sum_{i=1}^{N} c^i v_i$ because
in that case we would need to consider whether the limit exists, i.e. whether the sum converges in
some sense. More on this later.


⁷ See Hoffman and Kunze [13].


For the second space $\mathbb{C}_{\mathbb{R}}$, however, 1 does not span the space: scalar multiplying
1 by only real numbers cannot yield arbitrary complex numbers. To span the space
we also need a complex number such as $i$. The set $\{1, i\}$ then spans $\mathbb{C}_{\mathbb{R}}$ since any $z \in \mathbb{C}$
can be written as $z = a \cdot 1 + b \cdot i$, where $a, b \in \mathbb{R}$. Thus, $\mathbb{C}_{\mathbb{R}}$ is two-dimensional.

What we see, then, is that even though $\mathbb{C}$ and $\mathbb{C}_{\mathbb{R}}$ are identical as sets, they differ
in vector space properties such as their dimensionality. We sometimes express this
by saying that $\mathbb{C}$ has complex dimension one but real dimension two.

Furthermore, we usually think about complex numbers as $z = a + bi$ and
visualize them as the “complex plane”; when we do this we are really thinking
about $\mathbb{C}_{\mathbb{R}}$ rather than $\mathbb{C}$. So, even though it may seem strange to take $C = \mathbb{R}$ when
we could take $C = \mathbb{C}$, this is exactly what we usually do with complex numbers!


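Example 2.8. $\mathbb{R}^n$ and $\mathbb{C}^n$


$\mathbb{R}^n$ has the following natural basis, also known as the standard basis:

$$\begin{aligned}
&(1, 0, \ldots, 0), \\
&(0, 1, \ldots, 0), \\
&\qquad \vdots \\
&(0, \ldots, 1, 0), \\
&(0, \ldots, 0, 1).
\end{aligned} \tag{2.10}$$

You should check that this is indeed a basis, and thus that the dimension of $\mathbb{R}^n$ is,
unsurprisingly, $n$. The same set serves as a basis for $\mathbb{C}^n$, provided of course that you
take $C = \mathbb{C}$.⁸

Note that although the standard basis is the most natural one for $\mathbb{R}^n$ and $\mathbb{C}^n$, there
are infinitely many other perfectly respectable bases out there; you should check,
for instance, that $\{(1,1,0,\ldots,0), (0,1,1,0,\ldots,0), \ldots, (0,\ldots,1,1), (1,0,\ldots,0,1)\}$
is also a basis when $n$ is odd and $n > 2$. (For even $n$ the alternating sum of these
vectors is zero, so they are linearly dependent.)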


Example 2.9. $M_n(\mathbb{R})$ and $M_n(\mathbb{C})$


Let $E_{ij}$ be the $n \times n$ matrix with a 1 in the $i$th row, $j$th column and zeros everywhere
else. Then you can check that $\{E_{ij}\}_{i,j=1,\ldots,n}$ is a basis for both $M_n(\mathbb{R})$ and $M_n(\mathbb{C})$,
and that both spaces have dimension $n^2$. Again, there are other nice bases out there;
for instance, the symmetric matrices $S_{ij} \equiv E_{ij} + E_{ji}$, $i \le j$, and antisymmetric
matrices $A_{ij} \equiv E_{ij} - E_{ji}$, $i < j$, taken together also form a basis for both $M_n(\mathbb{R})$
and $M_n(\mathbb{C})$.


⁸ If you take $C = \mathbb{R}$, then you need to multiply the basis vectors in (2.10) by $i$ and add them to the
basis set, giving $\mathbb{C}^n$ a real dimension of $2n$.


Exercise 2.4. Let $S_n(\mathbb{R})$, $A_n(\mathbb{R})$ be the sets of $n \times n$ symmetric and antisymmetric matrices,
respectively. Show that both are real vector spaces, compute their dimensions, and check
that $\dim S_n(\mathbb{R}) + \dim A_n(\mathbb{R}) = \dim M_n(\mathbb{R})$, as expected.


Exercise 2.5. What is the real dimension of $M_n(\mathbb{C})$?


Example 2.10. $H_2(\mathbb{C})$


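Let’s find a basis for $H_2(\mathbb{C})$. First, we need to know what a general element of
$H_2(\mathbb{C})$ looks like. In terms of complex components, the condition $A = A^\dagger$ reads

$$\begin{pmatrix} a & b \\ c & d \end{pmatrix} = \begin{pmatrix} \bar{a} & \bar{c} \\ \bar{b} & \bar{d} \end{pmatrix}, \tag{2.11}$$

where the bar denotes complex conjugation. This means that $a, d \in \mathbb{R}$ and $b = \bar{c}$,
so in terms of real numbers we can write a general element of $H_2(\mathbb{C})$ as

$$\begin{pmatrix} t + z & x - iy \\ x + iy & t - z \end{pmatrix} = tI + x\sigma_x + y\sigma_y + z\sigma_z, \tag{2.12}$$

where $I$ is the identity matrix and $\sigma_x, \sigma_y, \sigma_z$ are the Pauli matrices defined in (2.3).
You can easily check that the set $\mathcal{B} = \{I, \sigma_x, \sigma_y, \sigma_z\}$ is linearly independent, and
since (2.12) shows that $\mathcal{B}$ spans $H_2(\mathbb{C})$, $\mathcal{B}$ is a basis for $H_2(\mathbb{C})$. We also see that
$\dim H_2(\mathbb{C}) = 4$.
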
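
The decomposition (2.12) can be read off entry by entry, which the following Python sketch makes explicit. The function names `pauli_components` and `from_components` are hypothetical, and the sample matrix is an arbitrary Hermitian matrix chosen for illustration.

```python
def pauli_components(A):
    """Return (t, x, y, z) with A = t*I + x*sigma_x + y*sigma_y + z*sigma_z.

    Assumes A is a 2x2 Hermitian matrix given as nested lists.
    """
    (a, b), (c, d) = A
    a, b, c, d = complex(a), complex(b), complex(c), complex(d)
    assert a.imag == 0 and d.imag == 0 and b == c.conjugate(), "A must be Hermitian"
    t = (a.real + d.real) / 2   # diagonal entries are t + z and t - z
    z = (a.real - d.real) / 2
    x = c.real                  # lower-left entry is x + i*y
    y = c.imag
    return t, x, y, z

def from_components(t, x, y, z):
    """Rebuild the matrix on the left-hand side of (2.12)."""
    return [[t + z, x - 1j * y], [x + 1j * y, t - z]]

A = [[3, 1 - 2j], [1 + 2j, -1]]
print(pauli_components(A))  # (1.0, 1.0, 2.0, 2.0)
```

All four components come out real, in line with the observation that the space is a four-dimensional real vector space.
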
Exercise 2.6. Using the matrices $S_{ij}$ and $A_{ij}$ from Example 2.9, construct a basis for
$H_n(\mathbb{C})$ and compute its dimension.


Exercise 2.7. $H_2(\mathbb{C})$ and $M_2(\mathbb{C})$ are both four-dimensional, yet $H_2(\mathbb{C})$ is clearly a proper
subset of $M_2(\mathbb{C})$, and hence should have lower dimensionality. Explain the apparent
paradox.


Example 2.11. $Y^l_m(\theta, \phi)$


We saw in the previous section that the $Y^l_m$ are elements of $\tilde{\mathcal{H}}_l$, which can be
obtained from $\mathcal{H}_l(\mathbb{R}^3)$ by restricting to the unit sphere. What’s more, the
set $\{Y^l_m\}_{-l \le m \le l}$ is actually a basis for $\tilde{\mathcal{H}}_l$. In the case $l = 1$ this is clear: we
have $\mathcal{H}_1(\mathbb{R}^3) = P_1(\mathbb{R}^3)$ and clearly $\{\frac{1}{\sqrt{2}}(x + iy),\ z,\ -\frac{1}{\sqrt{2}}(x - iy)\}$ is a basis,
and restricting this basis to the unit sphere gives the $l = 1$ spherical harmonics. For
$l > 1$ proving our claim requires a little more effort; see Problem 2-3.

Another, simpler basis for $\mathcal{H}_1(\mathbb{R}^3)$ would be the cartesian basis $\{x, y, z\}$; physicists
use the spherical harmonic basis because those functions are eigenfunctions
of the orbital angular momentum operator $L_z$, which on $\mathcal{H}_l(\mathbb{R}^3)$ is represented⁹ by
$L_z = -i\left(x\frac{\partial}{\partial y} - y\frac{\partial}{\partial x}\right)$. We shall discuss the relationship between the two bases in
detail later.


⁹ As mentioned in the preface, the $\hbar$ which would normally appear in this expression has been set
to 1.


Not Quite Example $L^2([-a, a])$

From doing 1-D problems in quantum mechanics one already “knows” that the set
$\{e^{i n \pi x / a}\}_{n \in \mathbb{Z}}$ is a basis for $L^2([-a, a])$. There’s a problem, however; we’re used to
taking infinite linear combinations of these basis vectors, but our definition above
only allows for finite linear combinations. What’s going on here? It turns out that
$L^2([-a, a])$ has more structure than your average vector space: it is an infinite-dimensional
Hilbert space, and for such spaces we have a generalized definition
of a basis, one that allows for infinite linear combinations. We will discuss Hilbert
spaces in Sect. 2.6.


2.3 Components 


One of the most useful things about introducing a basis for a vector space is that
it allows us to write elements of the vector space as $n$-tuples, in the form of either
column or row vectors, as follows: Given $v \in V$ and a basis $\mathcal{B} = \{e_i\}_{i=1,\ldots,n}$ for $V$,
we can write

$$v = \sum_{i=1}^{n} v^i e_i$$

for some numbers $v^i$, called the components of $v$ with respect to $\mathcal{B}$. We can then
represent $v$ by the column vector, denoted $[v]_{\mathcal{B}}$, as

$$[v]_{\mathcal{B}} \equiv \begin{pmatrix} v^1 \\ v^2 \\ \vdots \\ v^n \end{pmatrix}$$

or the row vector

$$[v]_{\mathcal{B}}^T \equiv (v^1, v^2, \ldots, v^n),$$

where the superscript $T$ denotes the usual transpose of a vector. The subscript $\mathcal{B}$ just
reminds us which basis the components are referred to, and will be dropped if there
is no ambiguity. With a choice of basis, then, every $n$-dimensional vector space can
be made to “look like” $\mathbb{R}^n$ or $\mathbb{C}^n$. Writing vectors in this way greatly facilitates
computation, as we’ll see. One must keep in mind, however, that vectors exist
independently of any chosen basis, and that their expressions as row or column
vectors depend very much on the choice of basis $\mathcal{B}$.

To make this last point explicit, let’s consider two bases for $\mathbb{R}^3$: the standard basis

$$e_1 = (1, 0, 0)$$
$$e_2 = (0, 1, 0)$$
$$e_3 = (0, 0, 1),$$

and the alternate basis we introduced in Example 2.8

$$e'_1 = (1, 1, 0)$$
$$e'_2 = (0, 1, 1)$$
$$e'_3 = (1, 0, 1).$$

We’ll refer to these bases as $\mathcal{B}$ and $\mathcal{B}'$, respectively. Let’s consider the components
of the vector $e'_1$ in both bases. If we expand in the basis $\mathcal{B}$, we have

$$e'_1 = 1 \cdot e_1 + 1 \cdot e_2 + 0 \cdot e_3$$

so

$$[e'_1]_{\mathcal{B}} = \begin{pmatrix} 1 \\ 1 \\ 0 \end{pmatrix} \tag{2.13}$$

as expected. If we expand $e'_1$ in the basis $\mathcal{B}'$, however, we get

$$e'_1 = 1 \cdot e'_1 + 0 \cdot e'_2 + 0 \cdot e'_3,$$

and so

$$[e'_1]_{\mathcal{B}'} = \begin{pmatrix} 1 \\ 0 \\ 0 \end{pmatrix},$$

which is of course not the same as (2.13).¹⁰ For fun, we can also express the standard
basis vector $e_1$ in the alternate basis $\mathcal{B}'$; you can check that

$$e_1 = \tfrac{1}{2} e'_1 - \tfrac{1}{2} e'_2 + \tfrac{1}{2} e'_3,$$

¹⁰ The simple form of $[e'_1]_{\mathcal{B}'}$ is no accident; you can easily check that if you express any set of basis
vectors in the basis that they define, the resulting column vectors will just look like the standard
basis.


and so

$$[e_1]_{\mathcal{B}'} = \begin{pmatrix} 1/2 \\ -1/2 \\ 1/2 \end{pmatrix}.$$

The following examples will explore the notion of components in a few other 
contexts. 


Example 2.12. Rigid Body Motion 


One area of physics where the distinction between a vector and its expression as an
ordered triple is crucial is rigid body motion. In this setting we usually deal with
two bases, an arbitrary but fixed set of space axes $K' = \{x', y', z'\}$ and a time-dependent
set of body axes $K = \{x(t), y(t), z(t)\}$ which is fixed relative to the rigid body. These
are illustrated in Fig. 2.1. When we write down vectors like the angular momentum
vector $\mathbf{L}$ or the angular velocity vector $\boldsymbol{\omega}$, we must keep in mind what basis we are
using, as the component expressions will differ drastically depending on the choice
of basis. For example, if there is no external torque on a rigid body, $[\mathbf{L}]_{K'}$ will be
constant whereas $[\mathbf{L}]_K$ will in general be time-dependent.


Fig. 2.1 Depiction of the fixed space axes $K'$ and the time-dependent body axes $K$, in gray. $K$ is attached to the rigid body and its basis vectors will change in time as the rigid body rotates


Example 2.13. Different bases for $\mathbb{C}^2$


As mentioned in Example 2.4, the vector space for a spin 1/2 particle is identifiable
with $\mathbb{C}^2$ and the angular momentum operators are given by $L_i = \frac{1}{2}\sigma_i$. In particular,
this means that

$$L_z = \begin{pmatrix} \frac{1}{2} & 0 \\ 0 & -\frac{1}{2} \end{pmatrix}$$

and so the standard basis vectors

$$e_1 = \begin{pmatrix} 1 \\ 0 \end{pmatrix}, \quad e_2 = \begin{pmatrix} 0 \\ 1 \end{pmatrix} \tag{2.14}$$

are eigenvectors of $L_z$ with eigenvalues of 1/2 and −1/2 respectively. Let’s say,
however, that we were interested in measuring $L_x$, where

$$L_x = \begin{pmatrix} 0 & \frac{1}{2} \\ \frac{1}{2} & 0 \end{pmatrix}.$$

Then we would need to use a basis $\mathcal{B}'$ of $L_x$ eigenvectors, which you can check are
given by

$$e'_1 = \frac{1}{\sqrt{2}} \begin{pmatrix} 1 \\ 1 \end{pmatrix}, \quad e'_2 = \frac{1}{\sqrt{2}} \begin{pmatrix} 1 \\ -1 \end{pmatrix}.$$

If we’re given the state $e_1$ and are asked to find the probability of measuring
$L_x = 1/2$, then we need to expand $e_1$ in the basis $\mathcal{B}'$, which gives

$$e_1 = \frac{1}{\sqrt{2}} e'_1 + \frac{1}{\sqrt{2}} e'_2$$

and so

$$[e_1]_{\mathcal{B}'} = \begin{pmatrix} \frac{1}{\sqrt{2}} \\ \frac{1}{\sqrt{2}} \end{pmatrix}. \tag{2.15}$$

This, of course, tells us that we have a probability of 1/2 for measuring $L_x = +1/2$
(and the same for $L_x = -1/2$). Hopefully this convinces you that the distinction
between a vector and its component representation is not just pedantic, but can be of
real physical importance. In fact, the two different component representations (2.14)
and (2.15) of the same vector $e_1$ are precisely what is needed to understand the
non-intuitive results of the Stern-Gerlach experiment; see Chap. 1 of Sakurai [17] for
details.
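
The numbers in this example can be reproduced in a few lines of Python. The sketch below assumes the measured operator is L_x = (1/2) sigma_x with normalized eigenvectors (1, 1)/sqrt(2) and (1, -1)/sqrt(2), and it uses the standard fact that components with respect to an orthonormal basis of C^2 are given by inner products, with probabilities equal to their squared magnitudes. The helper names `inner` and `matvec` are illustrative.

```python
import math

def inner(u, v):
    """Standard inner product on C^n, conjugate-linear in the first argument."""
    return sum(a.conjugate() * b for a, b in zip(u, v))

def matvec(M, v):
    """Apply a matrix (list of rows) to a vector."""
    return tuple(sum(m * a for m, a in zip(row, v)) for row in M)

s = 1 / math.sqrt(2)
L_x = [[0, 0.5], [0.5, 0]]
e1 = (1, 0)            # L_z eigenvector with eigenvalue +1/2
e1_prime = (s, s)      # assumed L_x eigenvector, eigenvalue +1/2
e2_prime = (s, -s)     # assumed L_x eigenvector, eigenvalue -1/2

# Check the eigenvector property L_x e' = (+/- 1/2) e'.
print(matvec(L_x, e1_prime), matvec(L_x, e2_prime))

# Components of e1 in the primed basis and the resulting probabilities.
comps = [inner(e1_prime, e1), inner(e2_prime, e1)]
probabilities = [abs(c) ** 2 for c in comps]
print(comps, probabilities)
```

Both probabilities come out equal to 1/2 up to floating-point rounding.
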
Example 2.14. $L^2([-a, a])$


We know from experience in quantum mechanics that all square-integrable functions
on an interval $[-a, a]$ have an expansion¹¹

$$f = \sum_{m=-\infty}^{\infty} c_m \, e^{i m \pi x / a}$$

in terms of the “basis” $\{\exp(i \frac{m \pi x}{a})\}_{m \in \mathbb{Z}}$. This expansion is known as the Fourier
series of $f$, and we see that the $c_m$, commonly known as the Fourier coefficients,
are nothing but the components of the vector $f$ in the basis $\{e^{i m \pi x / a}\}_{m \in \mathbb{Z}}$.
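
To see these components concretely, one can approximate them numerically. The sketch below assumes the standard formula c_m = (1/2a) times the integral over [-a, a] of f(x) exp(-i m pi x / a), which is not derived in this section, and evaluates it with a simple midpoint rule for the sample function f(x) = x on [-1, 1]. The closed form used for comparison, c_m = i a (-1)^m / (m pi) for nonzero m, follows from integration by parts; the function name and the sample function are illustrative choices.

```python
import cmath
import math

def fourier_coefficient(f, m, a, panels=2000):
    """Midpoint-rule approximation of (1/2a) * integral of f(x) exp(-i m pi x / a)."""
    h = 2 * a / panels
    total = 0j
    for k in range(panels):
        x = -a + (k + 0.5) * h
        total += f(x) * cmath.exp(-1j * m * math.pi * x / a)
    return total * h / (2 * a)

def f(x):
    return x

a = 1.0
for m in (1, 2, 3):
    approx = fourier_coefficient(f, m, a)
    exact = 1j * a * (-1) ** m / (m * math.pi)
    print(m, approx, exact)
```

The numerical and closed-form values agree to several decimal places; increasing `panels` tightens the agreement.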


2.4 Linear Operators 


One of the basic notions in linear algebra, fundamental in quantum mechanics, is
that of a linear operator. A linear operator on a vector space V is a function T from 
V to itself satisfying the linearity condition 


T(cv +w) = cT(v)+ T(w). (2.16) 


Sometimes we write $Tv$ instead of $T(v)$. You should check that the set of all linear
operators on $V$, with the obvious definitions of addition and scalar multiplication,
forms a vector space, denoted $\mathcal{L}(V)$. You have doubtless seen many examples of
linear operators: for instance, we can interpret a real $n \times n$ matrix as a linear operator
on $\mathbb{R}^n$ that acts on column vectors by matrix multiplication. Thus $M_n(\mathbb{R})$ (and,
similarly, $M_n(\mathbb{C})$) can be viewed as vector spaces whose elements are themselves
linear operators. In fact, that was exactly how we interpreted the vector subspace
$H_2(\mathbb{C}) \subset M_2(\mathbb{C})$ in Example 2.4; in that case, we identified elements of $H_2(\mathbb{C})$ as
the quantum-mechanical angular momentum operators. There are numerous other
examples of quantum-mechanical linear operators, for instance the familiar position
and momentum operators $\hat{x}$ and $\hat{p}$ that act on $L^2([-a, a])$ by

$$(\hat{x} f)(x) = x f(x)$$
$$(\hat{p} f)(x) = -i \frac{\partial f}{\partial x}(x)$$


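¹¹ This fact is proved in most real analysis books; see Rudin [16].


as well as the angular momentum operators $L_x$, $L_y$, and $L_z$ which act on $P_l(\mathbb{R}^3)$ by

$$\begin{aligned}
L_x(f) &= -i\left(y\frac{\partial f}{\partial z} - z\frac{\partial f}{\partial y}\right) \\
L_y(f) &= -i\left(z\frac{\partial f}{\partial x} - x\frac{\partial f}{\partial z}\right) \\
L_z(f) &= -i\left(x\frac{\partial f}{\partial y} - y\frac{\partial f}{\partial x}\right).
\end{aligned} \tag{2.17}$$

Another class of less familiar examples is given below.


Example 2.15. $\mathcal{L}(V)$ acting on $\mathcal{L}(V)$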


We are familiar with linear operators taking vectors into vectors, but they can also be
used to take linear operators into linear operators, as follows: Given $A, B \in \mathcal{L}(V)$,
we can define a linear operator $\mathrm{ad}_A \in \mathcal{L}(\mathcal{L}(V))$ acting on $B$ by

$$\mathrm{ad}_A(B) \equiv [A, B],$$

where $[\cdot, \cdot]$ indicates the commutator. Note that $\mathrm{ad}_A$ is a linear operator on a (vector)
space of linear operators! This action of $A$ on $\mathcal{L}(V)$ is called the adjoint action or
adjoint representation.

The adjoint representation has important applications in quantum mechanics; for
instance, the Heisenberg picture emphasizes $\mathcal{L}(V)$ rather than $V$ and interprets the
Hamiltonian as an operator in the adjoint representation. In fact, for any observable
$A$ the Heisenberg equation of motion reads

$$\frac{dA}{dt} = i \, \mathrm{ad}_H(A). \tag{2.18}$$

The adjoint representation is also essential in understanding spherical tensors, which
we’ll discuss in detail in Chap. 6.
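
The adjoint action is easy to experiment with when V is finite-dimensional and operators are represented by matrices. The following Python sketch builds ad_A as a function on 2 x 2 matrices and checks its linearity on a sample, using the Pauli matrices of (2.3) as test inputs. The helper names `matmul`, `add`, `scale` and `ad` are illustrative, and the check on one sample is not a proof of linearity.

```python
def matmul(A, B):
    """Product of two square matrices stored as lists of rows."""
    n = len(A)
    return [[sum(A[i][k] * B[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def add(A, B):
    return [[a + b for a, b in zip(ra, rb)] for ra, rb in zip(A, B)]

def scale(c, A):
    return [[c * a for a in row] for row in A]

def ad(A):
    """Return the operator ad_A, which sends B to the commutator [A, B] = AB - BA."""
    def ad_A(B):
        return add(matmul(A, B), scale(-1, matmul(B, A)))
    return ad_A

sigma_x = [[0, 1], [1, 0]]
sigma_y = [[0, -1j], [1j, 0]]
sigma_z = [[1, 0], [0, -1]]

ad_x = ad(sigma_x)
print(ad_x(sigma_y))  # equals 2i * sigma_z

# Linearity check: ad_A(c*B + C) == c*ad_A(B) + ad_A(C) for one sample.
c = 2 - 3j
lhs = ad_x(add(scale(c, sigma_y), sigma_z))
rhs = add(scale(c, ad_x(sigma_y)), ad_x(sigma_z))
print(lhs == rhs)
```

Returning a closure from `ad` mirrors the point made in the text: ad_A is itself an operator, one whose inputs and outputs are operators.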


Box 2.3 Invertibility for Linear Operators 

One important property of a linear operator $T$ is whether or not it is invertible,
i.e. whether there exists a linear operator $T^{-1}$ such that $T T^{-1} = T^{-1} T = I$,
where $I$ is the identity operator.¹² You may recall that, in general, an inverse for


¹² Throughout this text $I$ will denote the identity operator or identity matrix; it will be clear from
context which is meant.


a map $F$ between two arbitrary sets $A$ and $B$ (not necessarily vector spaces!)
exists if and only if $F$ is both one-to-one, meaning

$$F(a_1) = F(a_2) \implies a_1 = a_2 \quad \forall \, a_1, a_2 \in A,$$

and onto, meaning that

$$\forall \, b \in B \text{ there exists } a \in A \text{ such that } F(a) = b.$$

If this is unfamiliar, you should take a moment to convince yourself of this.
In the particular case of a linear operator $T$ on a finite-dimensional vector space $V$ (so that now
instead of considering a generic map $F : A \to B$, we’re considering a linear
map $T : V \to V$), these two conditions are actually equivalent. You’ll prove
this in Exercise 2.8 below. Furthermore, these two conditions turn out also to
be equivalent to the statement

$$T(v) = 0 \implies v = 0, \tag{2.19}$$

as you’ll show in Exercise 2.9. Equation (2.19) thus gives us a necessary and
sufficient criterion for invertibility of a linear operator, which we may express
as follows:


$T$ is invertible if and only if the only vector it sends to 0 is the zero vector.


Exercise 2.8. Suppose $V$ is finite-dimensional and let $T \in \mathcal{L}(V)$. Show that $T$ being one-to-one
is equivalent to $T$ being onto. Feel free to introduce a basis to assist you in the proof.


Exercise 2.9. Suppose $T(v) = 0 \implies v = 0$. Show that this is equivalent to $T$ being
one-to-one, which by the previous exercise is equivalent to $T$ being one-to-one and onto,
which is then equivalent to $T$ being invertible.


An important point to keep in mind is that a linear operator is not the same
thing as a matrix; just as with vectors, the identification can only be made once a
basis is chosen. For operators on finite-dimensional spaces this is done as follows:
choose a basis $\mathcal{B} = \{e_i\}_{i=1,\ldots,n}$. Then the action of $T$ is determined by its action on
the basis vectors,

$$T(v) = T\left(\sum_{i=1}^{n} v^i e_i\right) = \sum_{i=1}^{n} v^i \, T(e_i) = \sum_{i,j=1}^{n} v^i \, T_i{}^j \, e_j, \tag{2.20}$$

where the numbers $T_i{}^j$, again called the components of $T$ with respect to $\mathcal{B}$,¹³ are
defined by

$$T(e_i) = \sum_{j=1}^{n} T_i{}^j \, e_j. \tag{2.21}$$

Note that $T_i{}^j = T(e_i)^j$, the $j$th component of the vector $T(e_i)$. We now have

$$[v]_{\mathcal{B}} = \begin{pmatrix} v^1 \\ v^2 \\ \vdots \\ v^n \end{pmatrix} \quad \text{and} \quad [T(v)]_{\mathcal{B}} = \begin{pmatrix} \sum_{i=1}^{n} v^i \, T_i{}^1 \\ \sum_{i=1}^{n} v^i \, T_i{}^2 \\ \vdots \\ \sum_{i=1}^{n} v^i \, T_i{}^n \end{pmatrix},$$

which looks suspiciously like matrix multiplication. In fact, we can define the
matrix of T in the basis B, denoted $[T]_B$, by the matrix equation

$$[T(v)]_B = [T]_B\,[v]_B,$$


where the product on the right-hand side is given by the usual matrix multiplication. 
Comparison of the last two equations then shows that 


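$$[T]_B = \begin{pmatrix} T_1{}^1 & T_2{}^1 & \cdots & T_n{}^1 \\ T_1{}^2 & T_2{}^2 & \cdots & T_n{}^2 \\ \vdots & \vdots & & \vdots \\ T_1{}^n & T_2{}^n & \cdots & T_n{}^n \end{pmatrix}. \tag{2.22}$$


Thus, we really can use the components of T to represent it as a matrix, and once we
do so the action of T becomes just matrix multiplication by $[T]_B$! Furthermore, if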
we have two linear operators A and B and we define their product (or composition) 
AB as the linear operator 


(AB)(v) = A(B(v)), 


you can then show that [AB] = [A][B]. Thus, composition of operators becomes 
matrix multiplication of the corresponding matrices. 
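
The recipe of (2.21) and (2.22) can be tried out numerically. The following is only a minimal sketch, assuming the standard basis of R² and two operators chosen purely for illustration; the helper names `matrix_of`, `matmul` and `apply` are hypothetical and not part of the text.

```python
def matrix_of(T, n):
    """Return [T]_B for the standard basis of R^n, as a list of rows.

    Column i holds the components of T(e_i), as in (2.22).
    """
    basis = [[1 if k == i else 0 for k in range(n)] for i in range(n)]
    columns = [T(e) for e in basis]              # columns[i][j] = T_i^j
    return [[columns[i][j] for i in range(n)] for j in range(n)]


def matmul(A, B):
    """Ordinary matrix product of two square matrices given as lists of rows."""
    n = len(A)
    return [[sum(A[r][k] * B[k][c] for k in range(n)) for c in range(n)]
            for r in range(n)]


def apply(M, v):
    """Multiply the column vector v by the matrix M."""
    return [sum(M[r][k] * v[k] for k in range(len(v))) for r in range(len(M))]


# Two illustrative linear operators on R^2 (assumptions for this sketch).
def A(v):
    x, y = v
    return [x + 2 * y, y]        # a shear


def B(v):
    x, y = v
    return [-y, x]               # a rotation by 90 degrees


def AB(v):
    return A(B(v))               # the composition (AB)(v) = A(B(v))


mat_A, mat_B, mat_AB = matrix_of(A, 2), matrix_of(B, 2), matrix_of(AB, 2)
v = [3, -1]

print(mat_AB == matmul(mat_A, mat_B))   # composition vs. matrix product
print(apply(mat_A, v) == A(v))          # action of A vs. multiplication by [A]
```

Both comparisons should agree for any linear operators one substitutes for `A` and `B`; this is a numerical check, not a replacement for the proof requested in Exercise 2.10.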


¹³ Nomenclature to be justified in the next chapter.

Box 2.4 On Matrix Indices

One caveat about (2.22): the left (lower) index of $T_i{}^j$ labels the column, not
the row, in contrast to the usual convention (which we will also employ in this
text, mostly in Part II). It’s also helpful to note that the ith column of (2.22) is
just the column vector $[T(e_i)]$.


Exercise 2.10. For two linear operators A and B on a vector space V, show that [AB] = 
[A][B] in any basis. 


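Example 2.16. $L_z$, $H_1(\mathbb{R}^3)$ and Spherical Harmonics
Recall that $H_1(\mathbb{R}^3)$ is the set of all linear functions on $\mathbb{R}^3$ and that

$$\{rY^1_m\}_{-1 \le m \le 1} = \left\{ \tfrac{1}{\sqrt{2}}(x + iy),\; z,\; -\tfrac{1}{\sqrt{2}}(x - iy) \right\} \quad\text{and}\quad \{x, y, z\}$$

are both bases for this space.¹⁴ Now consider the familiar angular momentum
operator $L_z = -i\left(x\frac{\partial}{\partial y} - y\frac{\partial}{\partial x}\right)$ on this space. You can check that

$$\begin{aligned}
L_z\!\left(\tfrac{1}{\sqrt{2}}(x + iy)\right) &= \tfrac{1}{\sqrt{2}}(x + iy) \\
L_z(z) &= 0 \\
L_z\!\left(-\tfrac{1}{\sqrt{2}}(x - iy)\right) &= \tfrac{1}{\sqrt{2}}(x - iy),
\end{aligned}$$

which implies by (2.21) that the components of $L_z$ in this basis are

$$\begin{aligned}
(L_z)_1{}^1 &= 1 \\
(L_z)_1{}^2 &= (L_z)_1{}^3 = 0 \\
(L_z)_2{}^i &= 0 \quad \forall\, i \\
(L_z)_3{}^3 &= -1 \\
(L_z)_3{}^1 &= (L_z)_3{}^2 = 0.
\end{aligned}$$

Thus in the spherical harmonic basis,

$$[L_z]_{\{rY^1_m\}} = \begin{pmatrix} 1 & 0 & 0 \\ 0 & 0 & 0 \\ 0 & 0 & -1 \end{pmatrix}.$$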


¹⁴ We have again ignored the overall normalization of the spherical harmonics to avoid unnecessary
clutter.

This of course just says that the wavefunctions $x + iy$, $z$ and $x - iy$ have $L_z$
eigenvalues of 1, 0, and −1 respectively. Meanwhile,

$$\begin{aligned}
L_z(x) &= iy \\
L_z(y) &= -ix \\
L_z(z) &= 0
\end{aligned}$$

so in the cartesian basis,

$$[L_z]_{\{x,y,z\}} = \begin{pmatrix} 0 & -i & 0 \\ i & 0 & 0 \\ 0 & 0 & 0 \end{pmatrix}, \tag{2.23}$$

a very different looking matrix. Though the cartesian form of $L_z$ is not usually used
in physics texts, it has some very important mathematical properties, as we’ll see in
Part II of this book. For a preview of these properties, see Problem 2-4.
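
The two matrices of $L_z$ can be reproduced mechanically from (2.21). The sketch below is only illustrative and rests on several assumptions: a linear function ax + by + cz is stored as the coefficient triple (a, b, c), the spherical harmonic basis is taken with the 1/√2 normalization written above, and the helper names (`Lz`, `det3`, `components`, `matrix_in_basis`) are hypothetical. Components in a basis are found with Cramer's rule.

```python
import math


def Lz(f):
    """Apply L_z = -i(x d/dy - y d/dx) to f = a*x + b*y + c*z, stored as (a, b, c)."""
    a, b, c = f
    return (-1j * b, 1j * a, 0j)


def det3(m):
    """Determinant of a 3x3 matrix given as a list of rows."""
    (a, b, c), (d, e, f), (g, h, i) = m
    return a * (e * i - f * h) - b * (d * i - f * g) + c * (d * h - e * g)


def components(f, basis):
    """Components of f in `basis`, solving sum_j comp[j] * basis[j] = f by Cramer's rule."""
    M = [[basis[j][r] for j in range(3)] for r in range(3)]   # columns are basis vectors
    d = det3(M)
    comps = []
    for j in range(3):
        Mj = [[f[r] if col == j else M[r][col] for col in range(3)] for r in range(3)]
        comps.append(det3(Mj) / d)
    return comps


def matrix_in_basis(T, basis):
    """[T]_B: column i holds the components of T(e_i), as in (2.22)."""
    cols = [components(T(e), basis) for e in basis]
    return [[cols[i][j] for i in range(3)] for j in range(3)]


s = 1 / math.sqrt(2)
cartesian = [(1, 0, 0), (0, 1, 0), (0, 0, 1)]
spherical = [(s, 1j * s, 0), (0, 0, 1), (-s, 1j * s, 0)]

Lz_cart = matrix_in_basis(Lz, cartesian)   # should reproduce (2.23)
Lz_sph = matrix_in_basis(Lz, spherical)    # should be diagonal with entries 1, 0, -1

for row in Lz_cart:
    print(row)
for row in Lz_sph:
    print(row)
```

Because the spherical harmonic basis involves 1/√2, the entries of `Lz_sph` are floating-point numbers and agree with 1, 0 and −1 only up to rounding.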


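Exercise 2.11. Compute the matrices of $L_x = -i\left(y\frac{\partial}{\partial z} - z\frac{\partial}{\partial y}\right)$ and $L_y = -i\left(z\frac{\partial}{\partial x} - x\frac{\partial}{\partial z}\right)$
acting on $H_1(\mathbb{R}^3)$ in both the cartesian and spherical harmonic bases.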


Before concluding this section we should remark that there is much more one can 
say about linear operators, particularly concerning eigenvectors, eigenvalues, and 
diagonalization. Though these topics are relevant for physics, we will not need them 
in this text and good references for them abound, so we omit them. The interested 
reader can consult the first chapter of Sakurai [17] for a practical introduction, or 
Hoffman and Kunze [13] for a thorough discussion. 


2.5 Dual Spaces 


Another basic construction associated with a vector space, essential for understand- 
ing tensors and usually left out of the typical “mathematical methods for physicists” 
courses, is that of a dual vector. Roughly speaking, a dual vector is an object that eats 
a vector and spits out a number. This may not sound like a very natural operation, 
but it turns out that this notion is what underlies bras in quantum mechanics, as well 
as the raising and lowering of indices in relativity. We’ll explore these applications 
in Sect. 2.7. 

Now for the precise definitions. Given a vector space V with scalars C, a dual 
vector (or linear functional) on V is a C-valued linear function f on V, where 
“linear” again means 


$$f(cv + w) = c\,f(v) + f(w). \tag{2.24}$$

(Note that f(v) and f(w) are scalars, so that on the left-hand side of (2.24) the
addition takes place in V, whereas on the right side it takes place in C.) The set of
all dual vectors on V is called the dual space of V, and is denoted V*. It’s easily 
checked that the usual definitions of addition and scalar multiplication and the zero 
function turn V* into a vector space over C. 

The most basic examples of dual vectors are the following: let $\{e_i\}$ be a (not
necessarily finite) basis for V, so that an arbitrary vector v can be written as
$v = \sum_i v^i e_i$. Then for each i, we can define a dual vector $e^i$ by


$$e^i(v) \equiv v^i. \tag{2.25}$$


This just says that $e^i$ “picks off” the ith component of the vector v. Note that in
order for this to make sense, a basis has to be specified.

A key property of any dual vector f is that it is entirely determined by its values 
on basis vectors. By linearity we have 


$$\begin{aligned}
f(v) &= f\left(\sum_{i=1}^{n} v^i e_i\right) \\
&= \sum_{i=1}^{n} v^i f(e_i) \\
&\equiv \sum_{i=1}^{n} v^i f_i,
\end{aligned} \tag{2.26}$$

where in the last line we have defined

$$f_i \equiv f(e_i),$$

which we unsurprisingly refer to as the components of f in the basis $\{e_i\}$. To justify
this nomenclature, notice that the $e^i$ defined in (2.25) satisfy

$$e^i(e_j) = \delta^i_j. \tag{2.27}$$

If V is finite-dimensional with dimension n, it’s then easy to check (by evaluating
both sides on basis vectors) that we can write¹⁵

$$f = \sum_{i=1}^{n} f_i e^i$$

¹⁵ If V is infinite-dimensional, then this may not work as the sum required may be infinite, and as
mentioned before care must be taken in defining infinite linear combinations.




so that the $f_i$ really are the components of f. Since f was arbitrary, this means
that the $e^i$ span V*. In Exercise 2.12 below you will show that the $e^i$ are actually
linearly independent, so $\{e^i\}_{i=1,\dots,n}$ is actually a basis for V*. Because of this and
the relation (2.27) we sometimes say that the $e^i$ are dual to the $e_i$. Note that we have
shown that, for finite-dimensional V, V and V* always have the same dimension. We can
use the dual basis $\{e^i\} \equiv B^*$ to write f in components,

$$[f]_{B^*} = \begin{pmatrix} f_1 \\ f_2 \\ \vdots \\ f_n \end{pmatrix}.$$

In terms of the row vector $[f]^T_{B^*}$ we can write (2.26) as

$$f(v) = [f]^T_{B^*}\,[v]_B = [f]^T [v],$$


where in the last equality we dropped the subscripts indicating the bases. Again, we 
allow ourselves to do this whenever there is no ambiguity about which basis for V 
we’re using, and in all such cases we assume that the basis being used for V* is just 
the one dual to the basis for V. 

Finally, since $e^i(v) = v^i$ we note that we can alternatively think of the ith
component of a vector as the value of $e^i$ on that vector. This duality of viewpoint
will crop up repeatedly in the rest of the text. 
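
To make the definitions (2.25) through (2.27) concrete, here is a minimal sketch in R² with a deliberately non-standard basis. The basis, the sample functional `f`, and the names `components` and `dual` are all assumptions made for this illustration only.

```python
# Basis B = {e_1, e_2} of R^2 (deliberately not the standard one).
e = [(1, 0), (1, 1)]


def components(v):
    """Components (v^1, v^2) of v = (a, b) in B, read off from v = v^1 e_1 + v^2 e_2."""
    a, b = v
    return (a - b, b)


# Dual basis vectors e^i: each one "picks off" the ith component, as in (2.25).
dual = [lambda v, i=i: components(v)[i] for i in range(2)]


def f(v):
    """A sample linear functional on R^2."""
    a, b = v
    return 3 * a + 5 * b


f_comps = [f(e_i) for e_i in e]        # f_i = f(e_i), the components of f
v = (7, 2)

# (2.26): f(v) equals the sum over i of v^i f_i.
via_components = sum(f_i * v_i for f_i, v_i in zip(f_comps, components(v)))

print([[dual[i](e[j]) for j in range(2)] for i in range(2)])   # the relation (2.27)
print(f(v), via_components)
```

Note that the numbers `f_comps` depend on the basis chosen, even though the functional `f` itself does not.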


Exercise 2.12. By carefully working with the definitions, show that the $e^i$ defined in (2.25)
and satisfying (2.27) are linearly independent.


Example 2.17. Dual spaces of $\mathbb{R}^n$, $\mathbb{C}^n$, $M_n(\mathbb{R})$ and $M_n(\mathbb{C})$


Consider the basis $\{e_i\}$ of $\mathbb{R}^n$ and $\mathbb{C}^n$, where $e_i$ is the vector with a 1 in the ith place
and 0’s everywhere else; this is just the standard basis described in Example 2.8.
Now consider the element $f^j$ of V* which eats a vector in $\mathbb{R}^n$ or $\mathbb{C}^n$ and spits
out the jth component; clearly, $f^j(e_i) = \delta^j_i$ so the $f^j$ are just the dual vectors
$e^j$ described above. Similarly, for $M_n(\mathbb{R})$ or $M_n(\mathbb{C})$ consider the dual vector $f^{ij}$
defined by $f^{ij}(A) \equiv A_{ij}$; these vectors are clearly dual to the $E_{ij}$ and thus form
the corresponding dual basis. While the $f^{ij}$ may seem a little unnatural or artificial,
you should note that there is one linear functional on $M_n(\mathbb{R})$ and $M_n(\mathbb{C})$ which is
familiar: the trace functional, denoted Tr and defined by


$$\mathrm{Tr}(A) \equiv \sum_{i=1}^{n} A_{ii}.$$


Can you express Tr as a linear combination of the $f^{ij}$?

Not Quite Example. Dual space of $L^2([-a,a])$

We haven’t yet properly treated $L^2([-a,a])$ so we clearly cannot yet properly treat
its dual, but we would like to point out here that in infinite dimensions, dual spaces
get much more interesting. In finite dimensions, we saw above that a basis $\{e_i\}$ for
V induces a dual basis $\{e^i\}$ for V*, so in a sense V* “looks” very much like V.
This is not true in infinite dimensions; in this case, we still have linear functionals
dual to a given basis, but these may not span the dual space. Consider the case of
$L^2([-a,a])$; you can check that $\{e^n\}_{n\in\mathbb{Z}}$ defined by

$$e^n(f(x)) \equiv \frac{1}{2a}\int_{-a}^{a} e^{-i\frac{n\pi x}{a}} f(x)\,dx$$

satisfy $e^n\!\left(e^{i\frac{m\pi x}{a}}\right) = \delta^n_m$ and are hence dual to $\{e^{i\frac{m\pi x}{a}}\}_{m\in\mathbb{Z}}$. In fact, these linear
functionals just eat a function and spit out its nth Fourier coefficient. There are
linear functionals, however, that can’t be written as a linear combination of the $e^n$;
one such linear functional is the Dirac delta functional δ, defined by

$$\delta(f(x)) \equiv f(0). \tag{2.28}$$

You are probably instead used to the Dirac delta function, which is a “function” δ(x)
with the defining property that

$$\int_{-a}^{a} \delta(x) f(x)\,dx = f(0) \quad \forall\, f \in L^2([-a,a]). \tag{2.29}$$

Note the similarity, in that both are used to evaluate arbitrary functions at 0. Later,
in Sect. 2.7, we will clarify the nature of the Dirac delta “function” δ(x) and its
relationship to the Dirac delta functional δ, as well as prove that δ can’t be written
as a linear combination of the $e^n$.


2.6 Non-degenerate Hermitian Forms 


Non-degenerate Hermitian forms, of which the Euclidean dot product, Minkowski 
metric, and Hermitian scalar product of quantum mechanics are but a few examples, 
are very familiar to most physicists. We introduce them here not just to formalize 
their definition but also to make the fundamental but usually unacknowledged 
connection between these objects and dual spaces. 

A non-degenerate Hermitian form on a vector space V is a C-valued function
(·|·) which assigns to an ordered pair of vectors v, w ∈ V a scalar, denoted (v|w),
having the following properties:


1. $(v|w_1 + cw_2) = (v|w_1) + c\,(v|w_2)$ (linearity in the second argument)
2. $(v|w) = \overline{(w|v)}$ (Hermiticity; the bar denotes complex conjugation)
3. For each v ≠ 0 ∈ V, there exists w ∈ V such that (v|w) ≠ 0 (non-degeneracy)


Note that conditions 1 and 2 imply that $(cv|w) = \bar{c}\,(v|w)$, so (·|·) is
conjugate-linear in the first argument. Also note that for a real vector space, condition 2 implies
that (·|·) is symmetric, i.e. (v|w) = (w|v)¹⁶; in this case, (·|·) is called a metric.
Condition 3 is not immediately intuitive but means that any two distinct nonzero vectors
v, v′ ∈ V can always be distinguished by (·|·); see Box 2.5 for the precise statement
of this.

If, in addition to the above three conditions, the Hermitian form obeys 


4. (v|v) > 0 for all v ∈ V, v ≠ 0 (positive-definiteness)


then we say that (·|·) is an inner product, and a vector space with such a Hermitian
form is called an inner product space. In this case we can think of (v|v) as the
“length squared” of the vector v, and the notation


$$\|v\| \equiv \sqrt{(v|v)}$$


for the length (or norm) of v is sometimes used. Note that condition 4 implies 3 
(why?). Our reason for separating condition 4 from the rest of the definition will 
become clear when we consider the examples. 

One very important use of non-degenerate Hermitian forms is to define preferred
sets of bases known as orthonormal bases. Such bases $B = \{e_i\}$ by definition
satisfy $(e_i|e_j) = \pm\delta_{ij}$ and are extremely useful for computation, and ubiquitous
in physics for that reason. If (·|·) is positive-definite (hence an inner product),
then orthonormal basis vectors satisfy $(e_i|e_j) = \delta_{ij}$ and may be constructed out
of arbitrary bases by the Gram-Schmidt process. If (·|·) is not positive-definite, then
orthonormal bases may still be constructed out of arbitrary bases, though the process
is slightly more involved. See Hoffman and Kunze [13], Sections 8.2 and 10.2 for 
details. 


Box 2.5 The Meaning of Non-degeneracy 

The non-degeneracy condition given above is somewhat opaque. What does 
it mean, and what does it have to do with degeneracy? To answer this, 
suppose that the condition is violated, so that there exists a nonzero v* ∈ V such that
(v*|w) = 0 ∀ w ∈ V. Then for any other v ∈ V, the vectors v and v + v* are
indistinguishable by (·|·), i.e.

$$(v + v^*|w) = (v|w) + (v^*|w) = (v|w) \quad \text{for all } w \in V.$$

It is in this sense that v and v + v* are degenerate. You will show in
Exercise 2.14 below that the converse is also true, i.e. that if two distinct vectors in
V are degenerate in this sense then the non-degeneracy condition is violated.
Thus, the non-degeneracy condition is exactly what is required to ensure
non-degeneracy!

¹⁶ In this case, (·|·) is linear in the first argument as well as the second and would be referred to as
bilinear.


Exercise 2.13. Let (·|·) be an inner product. If a set of nonzero vectors $e_1, \dots, e_k$ is
orthogonal, i.e. $(e_i|e_j) = 0$ when i ≠ j, show that they are linearly independent. Note
that an orthonormal set (i.e., $(e_i|e_j) = \pm\delta_{ij}$) is just an orthogonal set in which the vectors
have unit length.


Exercise 2.14. Let v, v′ be two distinct nonzero vectors in V. Show that if (v|w) =
(v′|w) ∀ w ∈ V, then condition 3 above is violated.


Example 2.18. The dot product (or Euclidean metric) on $\mathbb{R}^n$
Let $v = (v^1, \dots, v^n)$, $w = (w^1, \dots, w^n) \in \mathbb{R}^n$. Define (·|·) on $\mathbb{R}^n$ by


$$(v|w) \equiv \sum_{i=1}^{n} v^i w^i.$$


This is sometimes written as v · w. You can check that (·|·) is an inner product, and
that the standard basis given in Example 2.8 is an orthonormal basis.


Example 2.19. The Hermitian scalar product on $\mathbb{C}^n$


Let $v = (v^1, \dots, v^n)$, $w = (w^1, \dots, w^n) \in \mathbb{C}^n$. Define (·|·) on $\mathbb{C}^n$ by


$$(v|w) \equiv \sum_{i=1}^{n} \bar{v}^i w^i. \tag{2.30}$$


Again, you can check that (·|·) is an inner product, and that the standard basis given
in Example 2.8 is an orthonormal basis. Such inner products on complex vector
spaces are sometimes referred to as Hermitian scalar products and are present on
every quantum-mechanical vector space. In this example we see the importance of
condition 2, manifested in the conjugation of the $v^i$ in (2.30); if that conjugation
wasn’t there, a vector like v = (i, 0, ..., 0) would have (v|v) = −1 and (·|·)
wouldn’t be an inner product. 
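
The role of the conjugation in (2.30) is easy to see numerically. The sketch below compares (2.30) with the same sum taken without conjugation; the function names and the sample vectors are assumptions chosen only for illustration.

```python
def hermitian(v, w):
    """(v|w) = sum_i conj(v^i) w^i, as in (2.30)."""
    return sum(vi.conjugate() * wi for vi, wi in zip(v, w))


def no_conjugation(v, w):
    """The same sum without the conjugation; this is NOT an inner product on C^n."""
    return sum(vi * wi for vi, wi in zip(v, w))


v = [1j, 0, 0]
w = [2 + 1j, 3, -1j]
c = 2 - 3j

print(hermitian(v, v))         # positive, as positive-definiteness requires
print(no_conjugation(v, v))    # negative: positive-definiteness fails

# Hermiticity (condition 2) and conjugate-linearity in the first argument.
print(hermitian(v, w) == hermitian(w, v).conjugate())
print(hermitian([c * vi for vi in v], w) == c.conjugate() * hermitian(v, w))
```

The last two comparisons use exact equality only because the sample entries are small integers; for general floating-point data a tolerance would be needed.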


Exercise 2.15. Let $A, B \in M_n(\mathbb{C})$. Define (·|·) on $M_n(\mathbb{C})$ by

$$(A|B) \equiv \frac{1}{2}\,\mathrm{Tr}(A^\dagger B). \tag{2.31}$$


Check that this is indeed an inner product by confirming properties 1 through 4 above (if
you’ve been paying close attention, though, you’ll see that it’s only necessary to check
properties 1, 2, and 4). Also check that the basis $\{I, \sigma_x, \sigma_y, \sigma_z\}$ for $H_2(\mathbb{C})$ is orthonormal
with respect to this inner product.

Example 2.20. The Minkowski Metric on 4-D Spacetime


Consider two vectors¹⁷ $v_i = (x_i, y_i, z_i, t_i) \in \mathbb{R}^4$, i = 1, 2. The Minkowski metric,
denoted η, is defined to be¹⁸


$$\eta(v_1, v_2) \equiv x_1 x_2 + y_1 y_2 + z_1 z_2 - t_1 t_2. \tag{2.32}$$


η is clearly linear in both its arguments (i.e., it’s bilinear) and symmetric, hence
satisfies conditions 1 and 2, and you will check condition 3 in Exercise 2.16 below.
Notice that for v = (1, 0, 0, 1), η(v, v) = 0 so η is not positive-definite, hence not
an inner product. This is why we separated condition 4, and considered the more
general non-degenerate Hermitian forms instead of just inner products.


Exercise 2.16. Let v = (x, y, z, t) be an arbitrary nonzero vector in $\mathbb{R}^4$. Show that η is
non-degenerate by finding another vector w such that η(v, w) ≠ 0.


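We should point out here that the Minkowski metric can be written in components
as a matrix, just as a linear operator can. Taking the standard basis $B = \{e_i\}_{i=1,\dots,4}$
in $\mathbb{R}^4$, we can define the components of η, denoted $\eta_{ij}$, as


$$\eta_{ij} \equiv \eta(e_i, e_j).$$


Then, just as was done for linear operators, you can check that if we define the
matrix of η in the basis B, denoted $[\eta]_B$, as the matrix


$$[\eta]_B \equiv \begin{pmatrix} \eta_{11} & \eta_{21} & \eta_{31} & \eta_{41} \\ \eta_{12} & \eta_{22} & \eta_{32} & \eta_{42} \\ \eta_{13} & \eta_{23} & \eta_{33} & \eta_{43} \\ \eta_{14} & \eta_{24} & \eta_{34} & \eta_{44} \end{pmatrix} = \begin{pmatrix} 1 & 0 & 0 & 0 \\ 0 & 1 & 0 & 0 \\ 0 & 0 & 1 & 0 \\ 0 & 0 & 0 & -1 \end{pmatrix}, \tag{2.33}$$

then we can write

$$\eta(v_1, v_2) = [v_2]^T [\eta] [v_1] = (x_2, y_2, z_2, t_2) \begin{pmatrix} 1 & 0 & 0 & 0 \\ 0 & 1 & 0 & 0 \\ 0 & 0 & 1 & 0 \\ 0 & 0 & 0 & -1 \end{pmatrix} \begin{pmatrix} x_1 \\ y_1 \\ z_1 \\ t_1 \end{pmatrix} \tag{2.34}$$


as some readers may be used to from computations in relativity. Note that the
symmetry of η implies that $[\eta]_B$ is a symmetric matrix for any basis B.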

¹⁷ These are often called “events” in the physics literature.


¹⁸ We are, of course, arbitrarily choosing the + + + − signature; we could equally well choose

Example 2.21. The Hermitian scalar product on $L^2([-a,a])$
For $f, g \in L^2([-a,a])$, define


$$(f|g) \equiv \frac{1}{2a}\int_{-a}^{a} \bar{f}\,g\,dx. \tag{2.35}$$


You can easily check that this defines an inner product on $L^2([-a,a])$, and that
$\{e^{i\frac{n\pi x}{a}}\}_{n\in\mathbb{Z}}$ is an orthonormal set. What’s more, this inner product turns $L^2([-a,a])$
into a Hilbert Space, which is an inner product space that is complete. The notion
of completeness is a technical one, so we will not give its precise definition, but
in the case of $L^2([-a,a])$ one can think of it as meaning roughly that a limit of
a sequence of square-integrable functions is again square-integrable. Making this
precise and proving it for $L^2([-a,a])$ is the subject of real analysis textbooks and far
outside the scope of this text,¹⁹ so we’ll content ourselves here with just mentioning
completeness and noting that it is responsible for many of the nice features of Hilbert 
spaces, in particular the generalized notion of a basis which we now describe. 

Given a Hilbert space H and an orthonormal (and possibly infinite) set $\{e_i\} \subset H$,
the set $\{e_i\}$ is said to be an orthonormal basis for H if


$$(e_i|f) = 0 \quad \forall\, i \implies f = 0. \tag{2.36}$$


You can check (see Exercise 2.17 below) that in the finite-dimensional case this 
definition is equivalent to our previous definition of an orthonormal basis. In the 
infinite-dimensional case, however, this definition differs substantially from the old 
one in that we no longer require Span$\{e_i\}$ = H (recall that spans only include finite
linear combinations). Does this mean, though, that we now allow arbitrary infinite
combinations of the basis vectors? If not, which ones are allowed? For $L^2([-a,a])$,
for which $\{e^{i\frac{n\pi x}{a}}\}_{n\in\mathbb{Z}}$ is an orthonormal basis, we mentioned in Example 2.14 that
any $f \in L^2([-a,a])$ can be written as


$$f = \sum_{n=-\infty}^{\infty} c_n e^{i\frac{n\pi x}{a}}, \tag{2.37}$$

where

$$\frac{1}{2a}\int_{-a}^{a} |f|^2\,dx = \sum_{n=-\infty}^{\infty} |c_n|^2 < \infty. \tag{2.38}$$


(The first equality in (2.38) should be familiar from quantum mechanics and follows
from Exercise 2.18 below.) The converse to this is also true, and this is where the
completeness of $L^2([-a,a])$ is essential: if a set of numbers $c_n$ satisfy (2.38), then
the series


$$g(x) = \sum_{n=-\infty}^{\infty} c_n e^{i\frac{n\pi x}{a}} \tag{2.39}$$


converges, yielding a square-integrable function g. So $L^2([-a,a])$ is the set of
all expressions of the form (2.37), subject to the condition (2.38). Now we know
how to think about infinite-dimensional Hilbert spaces and their bases: a basis for
a Hilbert space is an infinite set whose infinite linear combinations, together with
some suitable convergence condition, form the entire vector space.

¹⁹ See Rudin [16], for instance, for this and for proofs of all the claims made in this example.


Exercise 2.17. Show that the definition (2.36) of a Hilbert space basis is equivalent to our 
original definition of an (orthonormal) basis for a finite-dimensional inner product space V. 


Exercise 2.18. Show that for $f = \sum_{n=-\infty}^{\infty} c_n e^{i\frac{n\pi x}{a}}$ and $g = \sum_{m=-\infty}^{\infty} d_m e^{i\frac{m\pi x}{a}} \in L^2([-a,a])$,
their inner product can be written as


$$(f|g) = \sum_{n=-\infty}^{\infty} \bar{c}_n d_n. \tag{2.40}$$


Thus (·|·) on $L^2([-a,a])$ can be viewed as the infinite-dimensional version of the standard
Hermitian scalar product on $\mathbb{C}^n$.


Example 2.22. Various inner products on the space of real polynomials P(R)


All of the examples we’ve seen of non-degenerate Hermitian forms, with the 
exception of the Minkowski metric of Example 2.20, are actually positive definite, 
hence inner products. Furthermore, these inner products probably seem somewhat 
“natural,” in the sense that it’s hard to imagine what other kind of inner product one 
would want to define on, say, $\mathbb{R}^n$ or $\mathbb{C}^n$. This might suggest that inner products are
an inherent part of the vector spaces they’re defined on, as opposed to additional 
structure that we impose. After all, when does one come across a single vector 
space that has multiple different, useful inner products defined on it? In this example 
we will meet one such vector space, and find that we have met the different inner 
products on it through our study of differential equations in physics. 

The vector space in question is the space P(R) of polynomials in one real
variable x, with real coefficients. P(R) is just the set of all functions of the form


$$f(x) = c_0 + c_1 x + c_2 x^2 + \cdots + c_n x^n,$$


where $c_i \in \mathbb{R}\ \forall\, i$ and n is arbitrary. It is straightforward to verify that with the usual
addition and scalar multiplication of polynomials, this set is in fact a vector space.
Since the degree of a polynomial f ∈ P(R) is arbitrary, no finite basis exists and
this space is infinite-dimensional (for more detail on this, see Exercise 2.19 below).
An obvious basis for this space would be $B = \{1, x, x^2, x^3, \dots\}$, but this doesn’t
necessarily turn out to be the most useful choice. Instead, it’s more convenient to
choose an orthonormal basis. This leads immediately to a conundrum, however;
what inner product do we use to define orthonormality? It turns out that a very
useful family of inner products is given by


$$(f|g) \equiv \int_a^b f(x)\,g(x)\,W(x)\,dx, \qquad f, g \in P(\mathbb{R}),$$


where W(x) is a nonnegative weight function. Each different choice of integration
range [a, b] and weight function W(x) gives a different inner product, and will
yield different orthonormal bases (one can obtain the bases by just applying the
Gram-Schmidt process to the basis $B = \{1, x, x^2, x^3, \dots\}$). Amazingly, these
orthonormal bases turn out to be the various orthogonal polynomials (Legendre,
Hermite, Laguerre, etc.) one meets in studying the various differential equations
that arise in electrostatics and quantum mechanics!

As an example, let [a, b] = [−1, 1] and W(x) = 1. Then in Exercise 2.20 below
you will show that applying Gram-Schmidt to the set $S = \{1, x, x^2, x^3\} \subset B$ yields
(up to normalization) the first four Legendre polynomials


$$\begin{aligned}
P_0(x) &= 1 \\
P_1(x) &= x \\
P_2(x) &= \tfrac{1}{2}(3x^2 - 1) \\
P_3(x) &= \tfrac{1}{2}(5x^3 - 3x).
\end{aligned}$$


Recall that the Legendre polynomials show up in the solutions to the differential
equation (2.46), where we make the identification x = cos θ. Since −1 ≤ cos θ ≤ 1,
this explains the range of integration [a, b] = [−1, 1].
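
The Gram-Schmidt computation for the Legendre case can be carried out exactly with rational arithmetic. The sketch below is one possible implementation, under the assumptions that polynomials are stored as coefficient lists [c_0, c_1, c_2, c_3] and that each result is rescaled so that its value at x = 1 is 1 (the usual Legendre normalization); the function names are hypothetical.

```python
from fractions import Fraction


def inner(p, q):
    """(p|q) = integral over [-1, 1] of p(x) q(x) dx, i.e. W(x) = 1.

    The integral of x^k over [-1, 1] is 2/(k+1) for even k and 0 for odd k.
    """
    total = Fraction(0)
    for i, a in enumerate(p):
        for j, b in enumerate(q):
            if (i + j) % 2 == 0:
                total += Fraction(a) * Fraction(b) * Fraction(2, i + j + 1)
    return total


def gram_schmidt(vectors):
    """Orthogonalize (without normalizing) a list of polynomials."""
    ortho = []
    for v in vectors:
        u = [Fraction(c) for c in v]
        for e in ortho:
            coeff = inner(e, v) / inner(e, e)     # projection of v onto e
            u = [uc - coeff * ec for uc, ec in zip(u, e)]
        ortho.append(u)
    return ortho


def rescale(p):
    """Rescale p so that p(1) = 1."""
    value_at_one = sum(p)
    return [c / value_at_one for c in p]


monomials = [[1, 0, 0, 0], [0, 1, 0, 0], [0, 0, 1, 0], [0, 0, 0, 1]]   # 1, x, x^2, x^3
legendre = [rescale(p) for p in gram_schmidt(monomials)]

for p in legendre:
    print([str(c) for c in p])
```

Swapping in a different `inner` (a different range and weight function) would produce a different family of orthogonal polynomials, which is the point of this example; the Hermite case requires Gaussian integrals and is not covered by this sketch.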

One can obtain the other familiar orthogonal polynomials in this way as well. For
instance, let [a, b] = (−∞, ∞) and $W(x) = e^{-x^2}$. Again applying Gram-Schmidt
yields (again up to normalization) the first four Hermite polynomials


$$\begin{aligned}
H_0(x) &= 1 \\
H_1(x) &= 2x \\
H_2(x) &= 4x^2 - 2 \\
H_3(x) &= 8x^3 - 12x.
\end{aligned}$$


These polynomials arise in the solution to the Schrödinger equation for a
one-dimensional harmonic oscillator. In fact, you may have noticed that $W(x) = e^{-x^2}$ is
just a Gaussian, which is the form of the ground-state wavefunction for this system.
Note also that the range of integration [a, b] = (−∞, ∞) corresponds to the range
of the position variable, as expected.

Further examples are given in Problem 2-9. Remember, though, that the point 
here is that a single vector space may support many different inner products, all of 
which may be of physical relevance! 


Exercise 2.19. Verify that P(R) is a (real) vector space. Then show that P(R) is
infinite-dimensional by showing that, for any finite set S ⊂ P(R), there is a polynomial that is not
in Span S.


Exercise 2.20. Verify that applying the Gram-Schmidt process to $S = \{1, x, x^2, x^3\}$ with
the inner product


$$(f|g) \equiv \int_{-1}^{1} f(x)\,g(x)\,dx, \qquad f, g \in P(\mathbb{R})$$


yields, up to normalization, the first four Legendre polynomials, as claimed above. Do the
same for the Hermite polynomials, using [a, b] = (−∞, ∞) and $W(x) = e^{-x^2}$.


2.7 Non-degenerate Hermitian Forms and Dual Spaces 


We are now ready to explore the connection between dual vectors and non- 
degenerate Hermitian forms. This will allow us to put bras and kets, covariant and 
contravariant indices, and the Dirac delta function all into their proper context. After 
that we’ll be ready to discuss tensors in earnest in Chap. 3. 

Given a non-degenerate Hermitian form (·|·) on a finite-dimensional vector
space V, we can associate with any v ∈ V a dual vector $\tilde{v} \in V^*$ defined by


$$\tilde{v}(w) \equiv (v|w). \tag{2.41}$$


This defines a very important map

$$L : V \to V^*, \qquad v \mapsto \tilde{v},$$

which will crop up repeatedly in this chapter and the next. We’ll sometimes write $\tilde{v}$
as L(v) or (v|·), and refer to it as the metric dual of v.²⁰
Now, L is conjugate-linear since for v = cx + z, where v, z, x ∈ V,


$$L(v) = (v|\cdot) = (cx + z|\cdot) = \bar{c}\,(x|\cdot) + (z|\cdot) = \bar{c}\,L(x) + L(z),$$


where in the third equality we used the Hermiticity of our non-degenerate Hermitian
form. A word of warning here: as a map from V → V*, L is conjugate-linear, but
any L(v) in the range of L is a dual vector, hence a fully linear map from V → C.
In Exercise 2.21 below you will show that the non-degeneracy of (·|·) implies
that L is one-to-one and onto, so L is an invertible map from V to V*. This allows
us to identify V with V*, a fact we will discuss further below.

²⁰ The · in the notation (v|·) signifies the slot into which a vector w is to be inserted, yielding the
number (v|w).

Exercise 2.21. Use the non-degeneracy of (·|·) to show that L is one-to-one, i.e. that
$L(v) = L(w) \implies v = w$. Combine this with the argument used in Exercise 2.9 to
show that L is onto as well.


Box 2.6 $e^i$ vs. $L(e_i)$
Suppose we are given basis vectors $\{e_i\}_{i=1,\dots,n}$ for V. We then have two sets of
dual vectors corresponding to these basis vectors $e_i$: the dual basis vectors $e^i$
defined by $e^i(e_j) = \delta^i_j$, and the metric duals $L(e_i)$. It’s important to understand
that, for a given i, the dual vector $e^i$ is not necessarily the same as the metric
dual $L(e_i)$; in particular, the dual basis vector $e^i$ is defined relative to the whole
basis $\{e_i\}_{i=1,\dots,n}$, whereas the metric dual $L(e_i)$ only depends on what $e_i$ is, and
doesn’t care if we change the other basis vectors. Furthermore, $L(e_i)$ depends
on your non-degenerate Hermitian form (that’s why we call it a metric dual),
whereas $e^i$ does not. In fact, we introduced the dual basis vectors $e^i$ in Sect. 2.5,
before we even knew what non-degenerate Hermitian forms were!

You may wonder if there are special circumstances when it is true that
$e^i = L(e_i)$; this is Exercise 2.22.
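
The distinction drawn in Box 2.6 can be seen in a small numerical experiment. The sketch below assumes R² with the Euclidean dot product and an illustrative basis; the names `dual_basis` and `metric_dual` are hypothetical.

```python
e = [(1, 0), (1, 1)]        # an illustrative basis of R^2


def dot(v, w):
    """The Euclidean dot product, playing the role of (.|.)."""
    return sum(vi * wi for vi, wi in zip(v, w))


def dual_basis(i):
    """e^i for the basis above: picks off the ith component of v = v^1 e_1 + v^2 e_2."""
    def e_up(v):
        a, b = v
        return (a - b, b)[i]
    return e_up


def metric_dual(u):
    """L(u) = (u|.), which depends only on u and on the choice of metric."""
    return lambda w: dot(u, w)


w = (2, 5)
for i in range(2):
    print(dual_basis(i)(w), metric_dual(e[i])(w))   # the two dual vectors disagree on w
```

Here the two kinds of dual vector give different numbers on the same input, so for this basis they are genuinely different functionals.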


Exercise 2.22. Given a basis $\{e_i\}_{i=1,\dots,n}$, under what circumstances do we have $e^i = \tilde{e}_i$ for
all i?


Let’s now proceed to some examples, where you’ll see that you’re already 
familiar with L from a couple of different contexts. 


Example 2.23. Bras and kets in quantum mechanics 


Let H be a quantum-mechanical Hilbert space with inner product (·|·). In Dirac
notation, a vector ψ ∈ H is written as a ket |ψ⟩ and the inner product (ψ|φ)
is written ⟨ψ|φ⟩. What about bras, written as ⟨ψ|? What, exactly, are they? Most
quantum mechanics texts gloss over their definition, just telling us that they are in
1-1 correspondence with kets and can be combined with kets as ⟨ψ|φ⟩ to get a
scalar. We are also told that the correspondence between bras and kets is
conjugate-linear, i.e. that the bra corresponding to c|ψ⟩ is $\bar{c}$⟨ψ|. From what we have seen in
this section, then, we can conclude the following:


Bras are nothing but dual vectors.


These dual vectors are labeled in the same way as regular vectors, because the
map L allows us to identify the two. In short, ⟨ψ| is really just L(ψ), or equivalently
(ψ|·).

Example 2.24. Raising and lowering indices in relativity


Consider $\mathbb{R}^4$ with the Minkowski metric, let $B = \{e_\mu\}_{\mu=1,\dots,4}$ and $B^* = \{e^\mu\}_{\mu=1,\dots,4}$
be the standard basis and dual basis for $\mathbb{R}^4$ (we use a Greek index to conform
with standard physics notation²¹), and let $v = \sum_{\mu=1}^{4} v^\mu e_\mu \in \mathbb{R}^4$. What are the
components of the dual vector $\tilde{v}$ in terms of the $v^\mu$? Well, as we saw in Sect. 2.5,
the components of a dual vector are just given by evaluation on the basis vectors, so


$$\tilde{v}_\mu = \tilde{v}(e_\mu) = (v|e_\mu) = \sum_\nu v^\nu (e_\nu|e_\mu) = \sum_\nu v^\nu \eta_{\nu\mu}. \tag{2.42}$$


In matrices, this reads

$$[\tilde{v}]_{B^*} = [\eta]_B\,[v]_B$$


so matrix multiplication of a vector by the metric matrix gives the corresponding
dual vector in the dual basis. Thus, the map L is implemented in coordinates by
[η]. Now, we mentioned above that L is invertible; what does $L^{-1}$ look like in
coordinates? Well, by the above, $L^{-1}$ should be given by matrix multiplication by
$[\eta]^{-1}$, the matrix inverse to [η]. Denoting the components of this matrix by $\eta^{\mu\nu}$
(so that $\sum_\mu \eta^{\tau\mu}\eta_{\mu\nu} = \delta^\tau_\nu$) and writing $\tilde{f} \equiv L^{-1}(f)$ where f is a dual vector, we have


$$[\tilde{f}]_B = [\eta]^{-1}_B\,[f]_{B^*}$$


or in components

$$\tilde{f}^\mu = \sum_\nu \eta^{\nu\mu} f_\nu. \tag{2.43}$$


The expressions in (2.42) and (2.43) are probably familiar to you. In physics one
usually works with components of vectors, and in the literature on relativity the
numbers $v^\mu$ are called the contravariant components of v and the numbers
$v_\mu \equiv \sum_\nu v^\nu \eta_{\nu\mu}$ of (2.42) are referred to as the covariant components of v. We see now
that the contravariant components of a vector are just its usual components, while
its covariant components are actually the components of the associated dual
vector $\tilde{v}$. For a dual vector f, the situation is reversed: the covariant components
$f_\mu$ are its actual components, and the contravariant components are the components
of $\tilde{f}$. Since L allows us to turn vectors into dual vectors and vice versa, we usually
don’t bother trying to figure out whether something is “really” a vector or a dual
vector; it can be either, depending on which components we use.

The above discussion shows that the familiar process of “raising” and “lowering”
indices is just the application of the map L (and its inverse) in components. For an
interpretation of $[\eta]^{-1}$ as the matrix of a metric on $\mathbb{R}^{4*}$, see Problem 2-8.
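
In components, lowering and raising an index are just the sums (2.42) and (2.43). The following sketch assumes the + + + − signature and coordinate ordering (x, y, z, t) used in the text; the function names `lower` and `raise_index` and the sample vectors are illustrative assumptions.

```python
ETA = [[1, 0, 0, 0],
       [0, 1, 0, 0],
       [0, 0, 1, 0],
       [0, 0, 0, -1]]
ETA_INV = [row[:] for row in ETA]     # diag(1, 1, 1, -1) is its own inverse


def lower(v):
    """Covariant components of v, as in (2.42): sum over nu of v^nu eta_{nu mu}."""
    return [sum(v[nu] * ETA[nu][mu] for nu in range(4)) for mu in range(4)]


def raise_index(f):
    """Contravariant components, as in (2.43): sum over nu of eta^{nu mu} f_nu."""
    return [sum(ETA_INV[nu][mu] * f[nu] for nu in range(4)) for mu in range(4)]


v = [1, 2, 3, 4]
w = [2, 0, -1, 5]
v_low = lower(v)

print(v_low)                                        # only the time component changes sign
print(raise_index(v_low) == v)                      # raising undoes lowering
print(sum(a * b for a, b in zip(v_low, w)))         # the dual vector of v evaluated on w
```

The last line evaluates the metric dual of v on w, which by (2.41) is the same number as η(v, w).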


²¹ As long as we’re talking about “standard” physics notation, you should also be aware that in many
texts the indices run from 0 to 3 instead of 1 to 4, and in that case the 0th coordinate corresponds
to time.

Exercise 2.23. Consider $\mathbb{R}^3$ with the Euclidean metric. Show that the covariant and
contravariant components of a vector in an orthonormal basis are identical. This explains
why we never bother with this terminology, nor the concept of dual spaces, in basic physics
where $\mathbb{R}^3$ is the relevant vector space. Is the same true for $\mathbb{R}^4$ with the Minkowski metric?


Example 2.25. $L^2([-a,a])$ and its dual


In our above discussion of the map L we stipulated that V should be
finite-dimensional. Why? If you examine the discussion closely, you’ll see that the only
place where we use the finite-dimensionality of V is in showing that L is onto.
Does this mean that L is not necessarily onto in infinite dimensions? Consider
the Dirac delta functional $\delta \in L^2([-a,a])^*$ defined in (2.28). Is there a function
$\delta(x) \in L^2([-a,a])$ such that L(δ(x)) = δ? If so, then we’d have


$$g(0) = \delta(g) = (\delta(x)|g) = \int_{-a}^{a} \delta(x)\,g(x)\,dx, \tag{2.44}$$


and then δ(x) would be nothing but the Dirac delta function defined in (2.29)
(here we suppress the normalization factor 1/2a and the complex conjugation appearing
in (2.35), neither of which affects the argument). Does
such a square-integrable function exist? If we write g as


$$g(x) = \sum_{n=-\infty}^{\infty} d_n e^{i\frac{n\pi x}{a}},$$


then simply evaluating this at x = 0 gives


$$g(0) = \sum_{n=-\infty}^{\infty} d_n = (\delta(x)|g). \tag{2.45}$$


n=—oo 


Comparing this with (2.40) tells us that the function δ(x) must have Fourier
coefficients $c_n = 1$ for all n. Such $c_n$, however, do not satisfy (2.38), and hence
δ(x) cannot be a square-integrable function. Thus, there is no Dirac delta function,
only the Dirac delta functional δ. This also means that L is not onto in the case
of $L^2([-a,a])$. It is so convenient, however, to associate vectors with dual vectors
that we pretend that the delta function δ(x) exists and can be manipulated (say, by
differentiation) as an ordinary function. If you look closely, however, you’ll find that
δ(x) only crops up in integral expressions, and can always be rewritten in terms of
the delta functional. We’ll see an example of this in Problem 2-6.


Epilogue: Tiers of Structure in Linear Algebra 


In this chapter, as is often done in mathematics, we started with some basic notions 
(vectors and vector spaces) and gradually built complexity, culminating in the 
identification of a vector space V and its dual V* via a non-degenerate Hermitian 
form on V. Let us step back at this point and survey what we’ve done, so that we 
understand the tiers of structure at play. 

Before we could define an abstract vector space, we needed a candidate set V. 
Initially, this set, whatever it was (a set of n-tuples of numbers, or matrices, or 
functions, etc.), had no structure at all; it was just a set with elements. This is the 
first level of our hierarchy. To construct a vector space, we needed to combine this 
set with some notion of addition (really, just some map $+ : V \times V \to V$) along 
with some notion of scalar multiplication (really, just some map from $C \times V \to V$) 
that satisfied the axioms of Sect. 2.1. These notions are additional structure that one 
adds to V, and one calls the result a “vector space” only if those specific notions are 
compatible with the vector space axioms. Vector spaces are the second level of our 
hierarchy; see Table 2.1. 

That last paragraph might seem pedantic. If $V = \mathbb{R}^n$, for instance, do we really 
consider the notion of addition of vectors to be extra structure? What else could 
addition be, in this case? Is it really useful or necessary to separate the set V from 
the obvious addition operation it carries? These questions may be reasonable for $\mathbb{R}^n$, 
but it’s important to note that there are plenty of sets for which no useful or intuitive 
notion of addition exists. Consider the set 


$$S^2 = \{(x,y,z) \in \mathbb{R}^3 \mid x^2 + y^2 + z^2 = 1\},$$


which is, of course, nothing but the 2-D surface of a sphere in $\mathbb{R}^3$ (see Fig. 2.2). 
This is a perfectly well-defined and intuitive set, but I know of no meaningful way 
to perform addition on this set. If you wanted to add the North Pole $N = (0,0,1)$ 
to the South Pole $S = (0,0,-1)$, what should the resulting point of $S^2$ be? Note 
that we cannot simply add $N$ to $S$ as vectors in $\mathbb{R}^3$ because the resulting point, the 
origin $O$, is not in $S^2$. 
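
The failure of closure described above can be checked numerically. The following is a minimal sketch, assuming points are stored as 3-tuples of floats; the helper names `on_unit_sphere` and `add_in_r3` are hypothetical and chosen only for this illustration.

```python
import math

def on_unit_sphere(p, tol=1e-12):
    """Return True if p = (x, y, z) satisfies x^2 + y^2 + z^2 = 1."""
    return math.isclose(sum(c * c for c in p), 1.0, abs_tol=tol)

def add_in_r3(p, q):
    """Componentwise addition of two points, viewed as vectors in R^3."""
    return tuple(a + b for a, b in zip(p, q))

north = (0.0, 0.0, 1.0)
south = (0.0, 0.0, -1.0)
total = add_in_r3(north, south)

# Both poles lie on the sphere, but their sum in R^3 is the origin, which does not.
print(on_unit_sphere(north), on_unit_sphere(south))
print(total, on_unit_sphere(total))
```

The addition borrowed from the ambient space takes us out of the set, which is one way of seeing that the set alone does not come with a vector space structure.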


Fig. 2.2 The 2-sphere $S^2$ with North Pole $N = (0,0,1)$, South Pole $S = (0,0,-1)$, and origin $O = (0,0,0)$. Note that $O \notin S^2$ 

Thus not all sets are cut out to be vector spaces, and even though the operations 
of addition and scalar multiplication may seem trivial or obvious in the examples 
of this chapter, they really do confer a significant amount of non-trivial structure 
on a space. This structure is manifest in all the constructions and definitions one 
then proceeds to make: linear independence, span, basis sets, linear operators, dual 
spaces, etc. 

Note that none of these notions requires an inner product²²; the former exist 
completely independently of the latter. With an inner product, though, one may 
then (and only then!) speak of orthogonality, length of vectors, and the like. Thus, 
inner product spaces comprise the third level of our hierarchy, illustrated again 
in Table 2.1. To be sure, once one introduces this third level of structure, there is 
then interplay between the levels (as exemplified in Exercise 2.11, which says that 
orthogonal vectors are linearly independent), but you don’t need an inner product to 
know what linear independence means. Furthermore, level three of the hierarchy 
is rich enough that for a given set V with a given vector space structure, there 
can be multiple meaningful notions of an inner product. A good example of this 


Table 2.1 Levels of structure in linear algebra. One starts with a set, whose only 
attendant notion is membership in the set. Including operations of addition and scalar 
multiplication compatible with the axioms turns a set into a vector space, and then one 
can speak of linear independence, bases, etc. Endowing the vector space with an inner 
product then yields notions of orthogonality, length, angles, etc. 


Level 1   Set                    (Membership) 
            |  add: Addition, Scalar multiplication 
            v 
Level 2   Vector space           (Linear independence, basis, span, etc.) 
            |  add: Inner product (·|·) 
            v 
Level 3   Inner Product Space    (Orthogonality, length, unit vector, etc.) 


22 We could use non-degenerate Hermitian forms here rather than inner products to make a similar 
point, but will stick with inner products for definiteness. 

is the vector space $P(\mathbb{R})$ of polynomials of one real variable, which as we saw in 
Example 2.22 and Problem 2-9 has many different inner products, all of which 
have direct applications to quantum mechanics! 

This means that one cannot ask in general whether the polynomials $2x$ and $4x^2 - 2$ 
are orthogonal; the question is only meaningful once an inner product on $P(\mathbb{R})$ is 
specified. Once this is done, however, one can then speak of orthogonality, length, 
and the angle between vectors—notions which are absent for a vector space without 
inner product, as well as for a mere set. 
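
To make this concrete, here is a short sketch that evaluates $(2x \,|\, 4x^2 - 2)$ under two different inner products on $P(\mathbb{R})$. Both choices are assumptions made for illustration: an unweighted integral over $[-1, 1]$, and an integral over $(0, \infty)$ with weight $e^{-x}$. Polynomials are stored as coefficient lists, and the function names are hypothetical.

```python
from fractions import Fraction
from math import factorial

def poly_mul(f, g):
    """Multiply two polynomials given as coefficient lists [a0, a1, a2, ...]."""
    out = [Fraction(0)] * (len(f) + len(g) - 1)
    for i, a in enumerate(f):
        for j, b in enumerate(g):
            out[i + j] += Fraction(a) * Fraction(b)
    return out

def inner_interval(f, g):
    """(f|g) = integral of f(x) g(x) over [-1, 1]."""
    # The integral of x^n over [-1, 1] is 2/(n+1) for even n and 0 for odd n.
    return sum(c * Fraction(2, n + 1)
               for n, c in enumerate(poly_mul(f, g)) if n % 2 == 0)

def inner_weighted(f, g):
    """(f|g) = integral of f(x) g(x) exp(-x) over (0, infinity)."""
    # The integral of x^n exp(-x) over (0, infinity) is n!.
    return sum(c * factorial(n) for n, c in enumerate(poly_mul(f, g)))

p = [0, 2]       # 2x
q = [-2, 0, 4]   # 4x^2 - 2

print(inner_interval(p, q))   # 0: orthogonal under the first inner product
print(inner_weighted(p, q))   # 44: not orthogonal under the second
```

The same two vectors are orthogonal under one inner product and not under the other, so orthogonality is a property of the pair (vector space, inner product) rather than of the vector space alone.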


Chapter 2 Problems 


Note: Problems marked with an asterisk “*” tend to be longer, and/or more difficult, and/or 
more geared towards the completion of proofs and the tying up of loose ends. 
Though these problems are still worthwhile, they can be skipped on a first reading. 


2-1. For each notion below, determine whether it is defined on all vector spaces, 
or just those with non-degenerate Hermitian forms, or just those with an 
inner product (i.e., inner product spaces): 


a) Unit vector $\hat{x}$ for any $x \in V$ 

b) Basis 

c) Linear independence 

d) Length or norm 

e) Span 

f) Orthogonality 

g) Linear operator 

h) Angle between two vectors $v$ and $w$, as defined by $\cos\theta = \frac{(v|w)}{\|v\|\,\|w\|}$ 
i) Dual space 

j) Metric dual 


2-2. Prove that $L^2([-a,a])$ is closed under addition. You'll need the triangle 
inequality, as well as the following inequality, valid for all $\lambda \in \mathbb{R}$: 

$$0 \le \int_{-a}^{a} \left(|f| + \lambda|g|\right)^2 dx.$$


2-3. (*) In this problem we show that $\{r^l Y^l_m\}$ is a basis for $\mathcal{H}_l(\mathbb{R}^3)$, which 
implies that $\{Y^l_m\}$ is a basis for $\tilde{\mathcal{H}}_l$. We’ll gloss over a few subtleties here; 
for a totally rigorous discussion see Sternberg [19] or our discussion in 
Chap. 4. 

a) Let $f \in \mathcal{H}_l(\mathbb{R}^3)$, and write $f$ as $f = r^l Y(\theta,\phi)$. Then we know that 

$$\Delta_{S^2} Y = -l(l+1)Y. \tag{2.46}$$

If you have never done so, use the expression (2.6) for $\Delta_{S^2}$ and the 
expressions (2.17) for the angular momentum operators to show that 

$$\Delta_{S^2} = L_x^2 + L_y^2 + L_z^2 \equiv \mathbf{L}^2,$$

so that (2.46) says that $Y$ is an eigenfunction of $\mathbf{L}^2$, as expected. The 
theory of angular momentum²³ then tells us that $\mathcal{H}_l(\mathbb{R}^3)$ has dimension 
$2l+1$. 

b) Exhibit a basis for $\mathcal{H}_l(\mathbb{R}^3)$ by considering the function $f^l_0 \equiv (x+iy)^l$ 
and showing that 

$$L_z(f^l_0) = l f^l_0, \qquad L_+(f^l_0) \equiv (L_x + iL_y)(f^l_0) = 0.$$

The theory of angular momentum then tells us that $(L_-)^k f^l_0 \equiv f^l_k$ 
satisfies $L_z(f^l_k) = (l-k) f^l_k$ and that $\{f^l_k\}_{0 \le k \le 2l}$ is a basis for $\mathcal{H}_l(\mathbb{R}^3)$. 

c) Writing $f^l_k = r^l Y^l_{l-k}$ we see that $Y^l_m$ satisfies $\mathbf{L}^2 Y^l_m = -l(l+1)Y^l_m$ and 
$L_z Y^l_m = m Y^l_m$ as expected. Now use this definition of $Y^l_m$ to compute 
all the spherical harmonics for $l = 1, 2$ and show that this agrees, up to 
normalization, with the spherical harmonics as tabulated in any quantum 
mechanics textbook. If you read Example 2.6 and did Exercise 2.2, then 
all you have to do is compute $f^1_k$, $0 \le k \le 2$ and $f^2_k$, $0 \le k \le 4$ and 
show that these functions agree with the ones given there. 


2-4. In discussions of quantum mechanics you may have heard the phrase 
“angular momentum generates rotations.” What this means is that if one 
takes a component of the angular momentum such as $L_z$ and exponentiates 
it, i.e. if one considers the operator 

$$\exp(-i\phi L_z) = \sum_{n=0}^{\infty} \frac{1}{n!}(-i\phi L_z)^n = I - i\phi L_z + \frac{1}{2!}(-i\phi L_z)^2 + \frac{1}{3!}(-i\phi L_z)^3 + \cdots$$

(the usual power series expansion for $e^x$) then one gets the operator which 
represents a rotation about the z axis by an angle $\phi$. Confirm this in one 
instance by explicitly summing the power series for the operator $[L_z]_{\{x,y,z\}}$ 
of Example 2.16 to get 

$$\exp\left(-i\phi\,[L_z]_{\{x,y,z\}}\right) = \begin{pmatrix} \cos\phi & -\sin\phi & 0 \\ \sin\phi & \cos\phi & 0 \\ 0 & 0 & 1 \end{pmatrix},$$


the usual matrix for a rotation about the z-axis. 
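
A numerical sanity check can accompany the analytic summation. The sketch below sums a truncated power series and compares it with the rotation matrix. It assumes that the matrix of $L_z$ in the basis $\{x, y, z\}$ has the form written in the code (this should be checked against Example 2.16), and the helpers `mat_mul` and `mat_exp` are hypothetical names introduced here.

```python
import math

def mat_mul(a, b):
    """Product of two square matrices stored as lists of rows."""
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def mat_exp(m, terms=40):
    """Truncated power series I + m + m^2/2! + ... with `terms` terms."""
    n = len(m)
    result = [[complex(i == j) for j in range(n)] for i in range(n)]
    term = [row[:] for row in result]          # holds m^k / k!, starting at k = 0
    for k in range(1, terms):
        term = [[x / k for x in row] for row in mat_mul(term, m)]
        result = [[result[i][j] + term[i][j] for j in range(n)] for i in range(n)]
    return result

# Assumed matrix of L_z in the basis {x, y, z}.
L_z = [[0, -1j, 0],
       [1j, 0, 0],
       [0, 0, 0]]

phi = 0.7
generator = [[-1j * phi * x for x in row] for row in L_z]
R = mat_exp(generator)

rotation = [[math.cos(phi), -math.sin(phi), 0.0],
            [math.sin(phi), math.cos(phi), 0.0],
            [0.0, 0.0, 1.0]]
max_error = max(abs(R[i][j] - rotation[i][j]) for i in range(3) for j in range(3))
print(max_error)   # should be at the level of floating-point round-off
```

This only confirms the result for one value of $\phi$; the problem asks for the series to be summed in closed form.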


23 See Sakurai [17] or Gasiorowicz [7] or our discussion in Chap. 4. 

2-5. Let V be finite-dimensional, and consider the “double dual” space $(V^*)^*$. 
Define a map 

$$J : V \to (V^*)^*, \qquad v \mapsto J_v,$$

where $J_v$ acts on $f \in V^*$ by 

$$J_v(f) \equiv f(v).$$

Show that $J$ is linear, one-to-one, and onto. This means that we can identify 
$(V^*)^*$ with V itself, and so the process of taking duals repeats itself after 
two iterations. Note that no non-degenerate Hermitian form was involved 
in the definition of $J$! 


2-6. Consider a linear operator A on a vector space V. We can define a linear 
operator on V* called the transpose of A and denoted by $A^T$ as follows: 

$$(A^T(f))(v) \equiv f(Av) \quad \text{where } v \in V,\ f \in V^*.$$

a) If $\mathcal{B}$ is a basis for V and $\mathcal{B}^*$ the corresponding dual basis, show that 

$$[A^T]_{\mathcal{B}^*} = [A]_{\mathcal{B}}^T.$$

Thus the transpose of a matrix really has meaning; it’s the matrix 
representation of the transpose of the linear operator represented by the 
original matrix! 

b) Consider the linear operator $\frac{d}{dx}$ on $L^2([-a,a])$, as well as the Dirac delta 
functional $\delta \in L^2([-a,a])^*$ defined in (2.28). Show that 

$$\left(\frac{d}{dx}^T \delta\right)(f) = \frac{df}{dx}(0) \quad \forall\, f \in L^2([-a,a]).$$


This is the sense in which one may “differentiate the delta function.” 
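
Part a) can be illustrated numerically in a small case. The sketch below assumes that vectors are stored as lists of components in a basis, that dual vectors are stored as lists of components in the dual basis, and that $f(v) = f_i v^i$; the matrix `A` and the helper names are made up for the illustration.

```python
def apply(matrix, column):
    """Matrix acting on a list of components."""
    return [sum(matrix[i][j] * column[j] for j in range(len(column)))
            for i in range(len(matrix))]

def pair(f, v):
    """Action of a dual vector on a vector in components: f(v) = f_i v^i."""
    return sum(fi * vi for fi, vi in zip(f, v))

def transpose(matrix):
    """Ordinary matrix transpose."""
    return [list(row) for row in zip(*matrix)]

A = [[1, 2, 0],
     [0, 1, 3],
     [4, 0, 1]]
f = [2, -1, 5]    # components of a dual vector
v = [3, 1, -2]    # components of a vector

lhs = pair(apply(transpose(A), f), v)   # transposed matrix applied to f, then evaluated on v
rhs = pair(f, apply(A, v))              # f evaluated on Av
print(lhs, rhs)                         # 65 65
```

Agreement for one choice of `f` and `v` is of course not a proof, but it shows what the statement in part a) says in terms of arrays of numbers.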


2-7. a) Let A be a linear operator on a finite-dimensional, real or complex 
vector space V with inner product $(\cdot|\cdot)$. Using the transpose $A^T$ from 
the previous problem, as well as the map $L : V \to V^*$ defined in 
Sect. 2.7, we can construct a new linear operator $A^\dagger$; this is known as 
the Hermitian adjoint of A, and is defined as 

$$A^\dagger \equiv L^{-1} \circ A^T \circ L : V \to V. \tag{2.47}$$

Show that $A^\dagger$ satisfies 

$$(A^\dagger v|w) = (v|Aw), \tag{2.48}$$

which is the equation that is usually taken as the definition of the adjoint 
operator. Use this to show that $(A^\dagger)^\dagger = A$, and use that to show that 

$$(Av|w) = (v|A^\dagger w). \tag{2.49}$$

The advantage of our definition (2.47) is that it gives an interpretation to 
$A^\dagger$; it’s the operator you get by transplanting $A^T$, which originally acts 
on V*, over to V via L. Note that $A^\dagger$ is defined using the inner product 
$(\cdot|\cdot)$, and so combines information from both A and $(\cdot|\cdot)$. 

b) Show that in an orthonormal basis $\{e_i\}_{i=1\ldots n}$, $[A^\dagger] = [A]^\dagger$, where the 
dagger outside the brackets denotes the usual conjugate transpose of a 
matrix (if V is a real vector space, then the dagger outside the brackets 
will reduce to just the transpose). You may want to prove and use the 
fact $A_j{}^i = e^i(Ae_j)$. 

c) If A satisfies $A = A^\dagger$, A is then said to be self-adjoint or Hermitian. 
Since $A^\dagger$ is defined with respect to the inner product $(\cdot|\cdot)$, self- 
adjointness indicates a certain compatibility between A and $(\cdot|\cdot)$. Show 
that even when V is a complex vector space, any eigenvalue of A must 
be real. 

d) In part b) you showed that in an orthonormal basis the matrix of a Hermitian 
operator is a Hermitian matrix. Is this necessarily true in a non-orthonormal 
basis? (Hint: If you think this is true, you should prove it. If 
you think it isn’t, you should find a counterexample. A simple candidate 
counterexample would be the matrix of the $L_z$ operator of Example 2.16 
in a non-orthonormal basis such as {(1, 0, 0), (1, 1, 0), (0, 0, 1)}.) 


2-8. Let $g$ be a non-degenerate bilinear form on a vector space V (we have in 
mind the Euclidean metric on $\mathbb{R}^3$ or the Minkowski metric on $\mathbb{R}^4$). Pick an 
arbitrary (not necessarily orthonormal) basis, let $[g]^{-1}$ be the matrix inverse 
of $[g]$ in this basis, and write $g^{\mu\nu}$ for the components of $[g]^{-1}$. Also let 
$f, h \in V^*$. Define a non-degenerate bilinear form $\tilde{g}$ on V* by 

$$\tilde{g}(f, h) \equiv g(\tilde{f}, \tilde{h}),$$

where $\tilde{f} = L^{-1}(f)$ as in Example 2.24. Show that 

$$\tilde{g}^{\mu\nu} \equiv \tilde{g}(e^\mu, e^\nu) = g^{\mu\nu}$$


so that $[g]^{-1}$ is truly a matrix representation of a non-degenerate bilinear 
form on V*. 

2-9. This problem builds on Example 2.22 and further explores different bases 
for the vector space $P(\mathbb{R})$, the polynomials in one variable $x$ with real 
coefficients. 

a) Compute the matrix corresponding to the operator $\frac{d}{dx} \in \mathcal{L}(P(\mathbb{R}))$ with 
respect to the basis $\mathcal{B} = \{1, x, x^2, x^3, \ldots\}$. 

b) Consider the inner product 

$$(f|g) \equiv \int_0^\infty f(x)\,g(x)\,e^{-x}\,dx$$

on $P(\mathbb{R})$. Apply the Gram-Schmidt process to the set $S = 
\{1, x, x^2, x^3\} \subset \mathcal{B}$ to get (up to normalization) the first four Laguerre 
Polynomials 

$$\begin{aligned}
L_0(x) &= 1 \\
L_1(x) &= -x + 1 \\
L_2(x) &= \tfrac{1}{2}(x^2 - 4x + 2) \\
L_3(x) &= \tfrac{1}{6}(-x^3 + 9x^2 - 18x + 6).
\end{aligned}$$

These polynomials arise as solutions to the radial part of the Schrödinger 
equation for the Hydrogen atom. In this case $x$ is interpreted as a radial 
variable, hence the range of integration $(0, \infty)$. 
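
The Gram-Schmidt computation in part b) can be cross-checked with exact rational arithmetic. The sketch below assumes the weighted inner product $\int_0^\infty f g\, e^{-x}\,dx$ and uses the standard fact that $\int_0^\infty x^n e^{-x}\,dx = n!$. Polynomials are coefficient lists, vectors are orthogonalized but not normalized, and the function names are hypothetical.

```python
from fractions import Fraction
from math import factorial

def inner(f, g):
    """(f|g) = integral over (0, infinity) of f(x) g(x) exp(-x) dx."""
    # Expand the product term by term and use: integral of x^n exp(-x) = n!.
    return sum(Fraction(a) * Fraction(b) * factorial(i + j)
               for i, a in enumerate(f) for j, b in enumerate(g))

def gram_schmidt(vectors):
    """Orthogonalize a list of polynomials (coefficient lists) without normalizing."""
    basis = []
    for v in vectors:
        w = [Fraction(c) for c in v]
        for u in basis:
            coeff = inner(u, v) / inner(u, u)
            padded = u + [Fraction(0)] * (len(w) - len(u))
            w = [wc - coeff * uc for wc, uc in zip(w, padded)]
        basis.append(w)
    return basis

monomials = [[1], [0, 1], [0, 0, 1], [0, 0, 0, 1]]   # 1, x, x^2, x^3
orthogonal = gram_schmidt(monomials)
for poly in orthogonal:
    print([str(c) for c in poly])   # coefficients in order of increasing power of x
```

The output polynomials are $1$, $x - 1$, $x^2 - 4x + 2$, and $x^3 - 9x^2 + 18x - 6$, which agree with $L_0, \ldots, L_3$ up to overall constant factors.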


Chapter 3 
Tensors 


Now that we’re familiar with vector spaces we can finally approach the main subject 
of Part I, tensors. We’ll give the modern component-free definition, from which 
will follow the usual transformation laws that used to be the definition. We’ll then 
introduce the tensor product and apply it liberally in both classical and quantum 
physics, before specializing to its symmetric and antisymmetric variants. 

As in the previous chapter, the mathematics we’ll learn will unify and hopefully 
illuminate many disparate and enigmatic topics. These include more mathematical 
objects such as the cross product, determinant, and pseudovectors, as well as 
physical constructions such as entanglement, addition of angular momenta, and 
multipole moments. 

Also, in this chapter we’ll treat mostly finite-dimensional vector spaces and 
ignore the complications and technicalities that arise in the infinite-dimensional 
case. In the few examples where we apply our results to infinite-dimensional spaces, 
you should rest assured that these applications are legitimate (if not explicitly 
justified), and that rigorous justification can be found in the literature. 


Box 3.1 Einstein Summation Convention 

From here on out we will employ the Einstein summation convention, which 
is that whenever an index is repeated in an expression, once as a superscript 
and once as a subscript, then summation over that index is implied. Thus an 
expression like $v = \sum_{i=1}^n v^i e_i$ becomes $v = v^i e_i$, and $Te_j = \sum_{i=1}^n T_j{}^i e_i$ 
(where T is a linear operator) becomes 

$$Te_j = T_j{}^i e_i. \tag{3.1}$$

Any index that is summed over, either implicitly (via the Einstein convention) 
or explicitly (via a summation sign) is referred to as a dummy index. In 
contrast, any index which is not summed over is referred to as a free index. 
In both examples above $i$ is a dummy index, and in (3.1) $j$ is a free index. 
When writing equations that hold in arbitrary bases or coordinates, free indices 
must always appear exactly once on either side, in the same position (upper 
or lower). This is very different from how dummy indices appear in such 
equations, where (since they are indices of summation) it’s perfectly legal for 
them to appear on one side only. This differing behavior of free and dummy 
indices is evident in (3.1). 

We’ll comment further on the Einstein convention in Sect. 3.2. 


3.1 Definition and Examples 


Recall that in Chap. 1 we heuristically defined a rank r tensor as a multilinear 
function that eats r vectors and produces a number. We now reiterate this definition, 
but generalize it so that tensors can eat r vectors as well as s dual vectors. We thus 
define a tensor of type (r, s) on a vector space V as a C-valued function T on 


$$\underbrace{V \times \cdots \times V}_{r \text{ times}} \times \underbrace{V^* \times \cdots \times V^*}_{s \text{ times}}$$


which is linear in each argument, i.e. 


$$T(v_1 + cw, v_2, \ldots, v_r, f_1, \ldots, f_s) = T(v_1, \ldots, v_r, f_1, \ldots, f_s) + c\,T(w, v_2, \ldots, v_r, f_1, \ldots, f_s)$$

and similarly for all the other arguments. This property is called multilinearity. Note 
that dual vectors are (1,0) tensors, and that vectors can be viewed as (0, 1) tensors 
as follows: 


$$v(f) \equiv f(v) \quad \text{where } v \in V,\ f \in V^*. \tag{3.2}$$

Similarly, linear operators can be viewed as (1, 1) tensors as 


A(v, f) = f(Av). (3.3) 


We take (0,0) tensors to be scalars, as a matter of convention. You will show in 
Exercise 3.1 below that the set of all tensors of type (r,s) on a vector space V, 
denoted $\mathcal{T}^r_s(V)$ or just $\mathcal{T}^r_s$, forms a vector space. This should not come as much of 
a surprise since we already know that vectors, dual vectors, and linear operators 
all form vector spaces. Also, just as linearity implies that dual vectors and linear 
operators are determined by their values on the basis vectors, multilinearity implies 
the same thing for general tensors. To see this, let $\{e_i\}_{i=1\ldots n}$ be a basis for V and 
$\{e^i\}_{i=1\ldots n}$ the corresponding dual basis. Then, denoting the $i$th component of the 
vector $v_p$ as $v_p^i$, and the $j$th component of the dual vector $f_q$ as $f_{qj}$, repeated 
application of multilinearity gives (see Box 3.2 for help with the notation) 

$$\begin{aligned}
T(v_1, \ldots, v_r, f_1, \ldots, f_s) &= v_1^{i_1} \cdots v_r^{i_r} f_{1j_1} \cdots f_{sj_s}\, T(e_{i_1}, \ldots, e_{i_r}, e^{j_1}, \ldots, e^{j_s}) \\
&\equiv v_1^{i_1} \cdots v_r^{i_r} f_{1j_1} \cdots f_{sj_s}\, T_{i_1 \ldots i_r}{}^{j_1 \ldots j_s}
\end{aligned} \tag{3.4}$$

where, as before, the numbers 

$$T_{i_1 \ldots i_r}{}^{j_1 \ldots j_s} \equiv T(e_{i_1}, \ldots, e_{i_r}, e^{j_1}, \ldots, e^{j_s}) \tag{3.5}$$

are referred to as the components of T in the basis $\{e_i\}_{i=1\ldots n}$. You should check that 
this definition of the components of a tensor, when applied to vectors, dual vectors, 
and linear operators, agrees with the definitions given earlier. Also note that (3.5) 
gives us a concrete way to think about the components of tensors: they are the values 
of the tensor on the basis vectors. 
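
The content of (3.4) and (3.5) can be seen in a small example. The sketch below takes a (2,0) tensor on $\mathbb{R}^2$, reads off its components by evaluating it on the standard basis vectors, and then rebuilds its value on arbitrary vectors from those components. The particular bilinear function `T` is hypothetical, chosen only for illustration.

```python
def T(v, w):
    """A sample bilinear function on R^2 x R^2, i.e. a (2,0) tensor."""
    return 2 * v[0] * w[0] - v[0] * w[1] + 3 * v[1] * w[0] + 5 * v[1] * w[1]

basis = [(1, 0), (0, 1)]

# Components as in (3.5): the values of the tensor on the basis vectors.
components = [[T(e_i, e_j) for e_j in basis] for e_i in basis]

def from_components(v, w):
    """Evaluate the tensor as in (3.4): the double sum v^i w^j T_ij."""
    return sum(v[i] * w[j] * components[i][j]
               for i in range(2) for j in range(2))

v, w = (3, -1), (2, 4)
print(components)                        # [[2, -1], [3, 5]]
print(T(v, w), from_components(v, w))    # -26 -26
```

The four numbers in `components` determine the tensor completely, which is the sense in which a (2,0) tensor "is" a matrix once a basis has been chosen.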


Exercise 3.1. By choosing suitable definitions of addition and scalar multiplication, show 
that $\mathcal{T}^r_s(V)$ is a vector space. 


Box 3.2 Making Sense of Tensor Indices 

The profusion of subscripts, superscripts, and indices in equations like (3.4) 
can be quite intimidating at first. Normally, indices like i don’t have subscripts 
on them. For instance, if we wrote (3.4) for a (2,0) tensor we’d just have 


$$T(v,w) = v^i w^j T_{ij}$$


which is a fairly benign looking equation. When treating a general tensor, 
however, we must accommodate arbitrary numbers $r, s$ of vector and dual 
vector arguments. If we have $r$ vectors $v_1, v_2, \ldots, v_r$, then we’ll need $r$ dummy 
indices to label their components, and since $r$ is indefinite it’s impractical 
to choose our indices as different Roman letters $i, j, k$, etc. Instead, we 
adopt index subscripts and take $i_1, i_2, \ldots, i_r$ as the indices for $v_1, v_2, \ldots, v_r$. 
Understanding this switch in notation should help in untangling the indices in 
equations such as (3.4). 


If we have a non-degenerate bilinear form on V, then we may change the type of 
T by precomposing with the map $L$ or $L^{-1}$. If T is of type (1,1) with components 
$T_i{}^j$, for instance, then we may turn it into a tensor $\tilde{T}$ of type (2,0) by defining 
$\tilde{T}(v,w) \equiv T(v, L(w))$. This corresponds to lowering the second index, and we 
write the components of $\tilde{T}$ as $T_{ij}$, omitting the tilde since the fact that we lowered 
the second index implies that we precomposed with $L$.¹ This is in accord with 
the conventions in relativity, where given a vector $v \in \mathbb{R}^4$ we write $v_\mu$ for the 
components of $\tilde{v}$ when we should really write $\tilde{v}_\mu$. From this point on, if we have a 
non-degenerate bilinear form on a vector space, then we permit ourselves to raise 
and lower indices at will and without comment. In such a situation we often don’t 
discuss the type of a tensor, speaking instead of its rank, equal to $r + s$, which 
obviously doesn’t change as we raise and lower indices. 


Example 3.1. Linear operators in quantum mechanics 


Thinking about linear operators as (1, 1) tensors may seem a bit strange, but in fact 
this is what one does in quantum mechanics all the time! Given an operator $H$ on a 
quantum-mechanical Hilbert space spanned by orthonormal vectors $\{e_i\}$ (which in 
Dirac notation we would write as $\{|i\rangle\}$), we usually write $H|i\rangle$ for $H(e_i)$, $\langle j|i\rangle$ for 
$\tilde{e}_j(e_i) = (e_j|e_i)$, and $\langle j|H|i\rangle$ for $(e_j|He_i)$. Thus, (3.3) would tell us that (using 
orthonormal basis vectors instead of arbitrary vectors) 

$$H_i{}^j = H(e_i, e^j) = e^j(He_i) = \langle j|H|i\rangle,$$


where we converted to Dirac notation in the last equality to obtain the familiar 
quantum-mechanical expression for the components of a linear operator. These 
components are often referred to as matrix elements, since when we write operators 
as matrices the elements of the matrices are just the components arranged in a 
particular fashion, as in (2.22). 


Example 3.2. The Levi-Civita Tensor 


Note: If you read Chap. 1, then the material in the next two examples will be 
familiar. 
Consider the (3, 0) Levi-Civita tensor $\epsilon$ on $\mathbb{R}^3$ defined by 

$$\epsilon(u,v,w) \equiv (u \times v) \cdot w, \qquad u,v,w \in \mathbb{R}^3. \tag{3.6}$$

You will check below that $\epsilon$ really is multilinear, hence a tensor. It is well known 
from vector calculus that $(u \times v) \cdot w$ is the (oriented) volume of a parallelepiped 
spanned by $u$, $v$, and $w$ (see Fig. 3.1 below), so one can think of the Levi-Civita 
tensor as a kind of “volume operator” which eats three vectors and spits out the 
volume that they span. 

What about the components of the Levi-Civita tensor? If $\{e_1, e_2, e_3\}$ is the 
standard basis for $\mathbb{R}^3$, then (3.6) yields 


1 The desire to raise and lower indices at will is one reason why we offset tensor indices and write 
$T(e_i, e^j)$ as $T_i{}^j$, rather than $T_i^j$. Raising and lowering indices on the latter would be ambiguous! 

Fig. 3.1 The parallelepiped spanned by $u$, $v$, and $w$ 


$$\epsilon_{ijk} = \epsilon(e_i, e_j, e_k) = (e_i \times e_j) \cdot e_k = \bar{\epsilon}_{ijk},$$

where $\bar{\epsilon}_{ijk}$ is the usual Levi-Civita symbol. Thus the Levi-Civita symbol represents 
the components of an actual tensor, the Levi-Civita tensor! Furthermore, keeping in 
mind the interpretation of the Levi-Civita tensor as a volume operator, as well as the 
fact that the components $\epsilon_{ijk}$ are just the values of the tensor on the basis vectors, 
then we find that the usual definition of the Levi-Civita symbol, 

$$\bar{\epsilon}_{ijk} = \begin{cases} +1 & \text{if } \{ijk\} = \{1,2,3\}, \{2,3,1\}, \text{ or } \{3,1,2\} \\ -1 & \text{if } \{ijk\} = \{3,2,1\}, \{1,3,2\}, \text{ or } \{2,1,3\} \\ 0 & \text{otherwise,} \end{cases}$$


is just telling us, for instance, that a parallelepiped spanned by $\{e_1, e_2, e_3\}$ has 
oriented volume +1! 
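
The statement that the Levi-Civita symbol is just the table of values of $\epsilon$ on basis vectors can be reproduced directly from (3.6). The following is a minimal sketch using tuples for vectors in $\mathbb{R}^3$; the function names are hypothetical.

```python
def cross(u, v):
    """Cross product of two vectors in R^3."""
    return (u[1] * v[2] - u[2] * v[1],
            u[2] * v[0] - u[0] * v[2],
            u[0] * v[1] - u[1] * v[0])

def dot(u, v):
    """Euclidean dot product."""
    return sum(a * b for a, b in zip(u, v))

def epsilon(u, v, w):
    """The Levi-Civita tensor of (3.6): the oriented volume (u x v) . w."""
    return dot(cross(u, v), w)

e = [(1, 0, 0), (0, 1, 0), (0, 0, 1)]

# Components: values of the tensor on the standard basis, indexed from 1.
components = {(i + 1, j + 1, k + 1): epsilon(e[i], e[j], e[k])
              for i in range(3) for j in range(3) for k in range(3)}

print(components[(1, 2, 3)], components[(2, 1, 3)], components[(1, 1, 2)])  # 1 -1 0
print(epsilon((2, 0, 0), (0, 3, 0), (0, 0, 4)))  # 24, the volume of a 2 x 3 x 4 box
```

Of the 27 components, exactly six are nonzero: three equal to +1 (the cyclic orderings of 1, 2, 3) and three equal to -1.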


Exercise 3.2. Verify that the Levi-Civita tensor as defined by (3.6) really is multilinear. 

Exercise 3.3. Using the Euclidean metric on $\mathbb{R}^3$ we can raise the last index on $\epsilon$ to get a 
(2,1) tensor $\tilde{\epsilon}$, where 

$$\tilde{\epsilon}(v, w, f) \equiv \epsilon(v, w, L^{-1}(f)) \quad \forall\, v, w \in V,\ f \in V^*.$$

Now, just as we can interpret a (1,1) tensor as a map from $V \to V$, we can interpret the 
(2,1) tensor $\tilde{\epsilon}$ as a map $\tilde{\epsilon}_V : V \times V \to V$ via the equation $\tilde{\epsilon}(v,w,f) = f(\tilde{\epsilon}_V(v,w))$. 
Compute $\tilde{\epsilon}_V$. It should look familiar! 


Example 3.3. The Moment of Inertia Tensor 


The moment of inertia tensor, denoted $\mathcal{I}$, is the symmetric (2,0) tensor on $\mathbb{R}^3$ which, 
when evaluated on the angular velocity vector, yields twice the kinetic energy of a rigid 
body, i.e. 

$$\frac{1}{2}\mathcal{I}(\omega, \omega) = KE. \tag{3.7}$$

Alternatively we can raise an index on $\mathcal{I}$ and define it to be the linear operator which 
eats the angular velocity and spits out the angular momentum, i.e. 

$$\mathbf{L} = \mathcal{I}\omega. \tag{3.8}$$

Equations (3.7) and (3.8) are most often seen in components (referred to a Cartesian 
basis), where they read 

$$KE = \frac{1}{2}[\omega]^T [\mathcal{I}] [\omega], \qquad [\mathbf{L}] = [\mathcal{I}][\omega].$$

Note that since we raise and lower indices with an inner product and usually use 
orthonormal bases, the components of $\mathcal{I}$ when viewed as a (2,0) tensor and when 
viewed as a (1,1) tensor are the same, cf. Exercise 2.23. 


Example 3.4. Multipole moments 


It is a standard result from electrostatics that the scalar potential $\Phi(\mathbf{r})$ of a charge 
distribution $\rho(\mathbf{r}')$ localized around the origin in $\mathbb{R}^3$ can be expanded in a Taylor 
series as² 

$$\Phi(\mathbf{r}) = \frac{1}{4\pi}\left[\frac{Q_0}{r} + \frac{Q_1(\mathbf{r})}{r^3} + \frac{1}{2!}\frac{Q_2(\mathbf{r},\mathbf{r})}{r^5} + \frac{1}{3!}\frac{Q_3(\mathbf{r},\mathbf{r},\mathbf{r})}{r^7} + \cdots\right],$$

where the $Q_i$ are $i$th rank tensors known as the multipole moments of the charge 
distribution $\rho(\mathbf{r}')$. The first few multipole moments are familiar to most physicists: 
the first, $Q_0$, is just the total charge or monopole moment of the charge distribution 
and is given by 

$$Q_0 = \int \rho(\mathbf{r}')\, d^3r'.$$

The second, $Q_1$, is a dual vector known as the dipole moment (often denoted as $\mathbf{p}$), 
which has components 

$$p_i = \int x'_i\, \rho(\mathbf{r}')\, d^3r'.$$

The third multipole moment, $Q_2$, is known as the quadrupole moment and has 
components given by 
components given by 


2 Here and below we set all physical constants such as $c$ and $\epsilon_0$ equal to 1. 

$$Q_{ij} = \int \left(3x'_i x'_j - r'^2 \delta_{ij}\right) \rho(\mathbf{r}')\, d^3r'.$$

Notice that the $Q_{ij}$ are symmetric in $i$ and $j$, and that $\sum_i Q_{ii} = 0$. Analogous 
properties hold for the higher order multipole moments as well (i.e., the octopole 
moment $Q_3$ has components $Q_{ijk}$ which are totally symmetric and which satisfy 
$\sum_i Q_{iij} = \sum_i Q_{iji} = \sum_i Q_{jii} = 0$). We will explain these curious features of the 
$Q_i$ at the end of this chapter. 
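
The symmetry and tracelessness of the quadrupole components can be checked on a simple example. The sketch below replaces the integral over a continuous charge density with a sum over point charges, which is an assumption made to keep the computation exact; the charges and positions are made up.

```python
def quadrupole(charges):
    """Q_ij = sum over point charges of q * (3 x_i x_j - r^2 delta_ij)."""
    Q = [[0] * 3 for _ in range(3)]
    for q, x in charges:
        r2 = sum(c * c for c in x)
        for i in range(3):
            for j in range(3):
                Q[i][j] += q * (3 * x[i] * x[j] - (r2 if i == j else 0))
    return Q

# Hypothetical configuration: (charge, position) pairs.
charges = [(1, (1, 0, 0)), (-1, (0, 2, 0)), (2, (1, 1, 1))]

Q = quadrupole(charges)
trace = sum(Q[i][i] for i in range(3))
is_symmetric = all(Q[i][j] == Q[j][i] for i in range(3) for j in range(3))

print(Q)
print(trace, is_symmetric)   # 0 True
```

The trace vanishes for every configuration, since each charge contributes $q(3r^2 - 3r^2) = 0$ to it.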


Example 3.5. Metric Tensors 


We met the Euclidean metric on $\mathbb{R}^n$ in Example 2.18 and the Minkowski metric on 
$\mathbb{R}^4$ in Example 2.20, and it’s easy to verify that both are (2,0) tensors (why isn’t 
the Hermitian scalar product of Example 2.19 included?). We also have the inverse 
metrics, defined in Problem 2-8, and you can verify that these are (0,2) tensors. 


Exercise 3.4. Show that for a metric $g$ on $V$, 

$$g_\mu{}^\nu = \delta_\mu{}^\nu,$$

so the (1, 1) tensor associated with $g$ (via $g^{-1}$) is just the identity operator. You will need the 
components $g^{\mu\nu}$ of the inverse metric, defined in Problem 2-8. 


3.2 Change of Basis 


Now we are in a position to derive the usual transformation laws that historically 
were taken as the definition of a tensor. Suppose we have a vector space V and two 
bases for V, $\mathcal{B} = \{e_i\}_{i=1\ldots n}$ and $\mathcal{B}' = \{e_{i'}\}_{i=1\ldots n}$. Since $\mathcal{B}$ is a basis, each of the $e_{i'}$ 
can be expressed as 

$$e_{i'} = A^j_{i'} e_j \tag{3.9}$$

for some numbers $A^j_{i'}$. Likewise, there exist numbers $A^{j'}_i$ (note that here the upper 
index is primed) such that 

$$e_i = A^{j'}_i e_{j'}.$$

We then have 

$$e_i = A^{j'}_i e_{j'} = A^{j'}_i A^k_{j'} e_k \tag{3.10}$$

and can then conclude that 

$$A^{j'}_i A^k_{j'} = \delta^k_i. \tag{3.11}$$

Considering (3.10) with the primed and unprimed indices switched also yields 

$$A^j_{i'} A^{k'}_j = \delta^{k'}_{i'}, \tag{3.12}$$

so, in a way, $A^{j'}_i$ and $A^j_{i'}$ are inverses of each other. Notice that $A^{j'}_i$ and $A^j_{i'}$ are not to 
be interpreted as the components of tensors, as their indices refer to different bases.³ 
How do the corresponding dual bases transform? Let $\{e^i\}_{i=1\ldots n}$ and $\{e^{i'}\}_{i=1\ldots n}$ be 
the bases dual to $\mathcal{B}$ and $\mathcal{B}'$. Then the components of $e^{i'}$ with respect to $\{e^i\}_{i=1\ldots n}$ are 

$$e^{i'}(e_j) = e^{i'}(A^{k'}_j e_{k'}) = A^{k'}_j \delta^{i'}_{k'} = A^{i'}_j, \tag{3.13}$$

i.e. 

$$e^{i'} = A^{i'}_j e^j. \tag{3.14}$$

Likewise, 

$$e^i = A^i_{j'} e^{j'}. \tag{3.15}$$

Notice how well the Einstein summation convention and our convention for priming 
indices work together in the transformation laws.⁴ 

Now we are ready to see how the components of tensors transform. Before 
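proceeding to the general case, let’s warm up by considering a (1,1) tensor with 
components $T_i{}^j$. Its components in the primed basis are given by 

$$\begin{aligned}
T_{i'}{}^{j'} &= T(e_{i'}, e^{j'}) && \text{by (3.5)} \\
&= T(A^k_{i'} e_k, A^{j'}_l e^l) && \text{by (3.9) and (3.14)} \\
&= A^k_{i'} A^{j'}_l\, T(e_k, e^l) && \text{by multilinearity} \\
&= A^k_{i'} A^{j'}_l\, T_k{}^l.
\end{aligned} \tag{3.16}$$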


3 This is also why we wrote the upper index directly above the lower index, rather than with 
a horizontal offset as is customary for tensors. For more about these numbers and a possible 
interpretation, see the beginning of the next section. 

4 This convention, in which we prime the indices rather than the objects themselves, is sometimes 
known as the Schouten convention; for more on this, see Battaglia and George [3]. 

Equation (3.16) may look familiar to you. Note that we have derived it, rather than 
taking it as the definition of a (1,1) tensor. Note also how there are two factors of the 
As, one for each argument of T. Thus the rank of the tensor determines the number 
of factors of A that occur in the transformation law. Also notice that the free indices 
$i'$ and $j'$ appear in the same position (lower and upper, respectively) on both sides 
of (3.16), in accordance with the discussion in Box 3.1. 
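
The law (3.16) is easy to apply mechanically. The sketch below transforms the components of a (1,1) tensor on a two-dimensional space and checks two consequences of (3.11): the two arrays of $A$'s multiply to the identity, and the contraction $T_i{}^i$ is the same in both bases. The storage conventions are assumptions stated in the comments, and the numbers are made up.

```python
def transform(T, A, A_inv):
    """Apply (3.16): T'[i][j] = sum over k, l of A_inv[k][i] * A[j][l] * T[k][l]."""
    n = len(T)
    return [[sum(A_inv[k][i] * A[j][l] * T[k][l]
                 for k in range(n) for l in range(n))
             for j in range(n)] for i in range(n)]

# Assumed storage: A_inv[k][i] holds A^k_{i'}, so column i lists the components
# of the new basis vector e_{i'} in the old basis (here e_1' = e_1, e_2' = e_1 + e_2).
A_inv = [[1, 1],
         [0, 1]]
# Assumed storage: A[j][l] holds A^{j'}_l.
A = [[1, -1],
     [0, 1]]
# Assumed storage: T[k][l] holds T_k^l (first index lower, second index upper).
T = [[1, 2],
     [3, 4]]

product = [[sum(A[i][k] * A_inv[k][j] for k in range(2)) for j in range(2)]
           for i in range(2)]
T_primed = transform(T, A, A_inv)
trace_old = sum(T[i][i] for i in range(2))
trace_new = sum(T_primed[i][i] for i in range(2))

print(product)               # [[1, 0], [0, 1]]
print(T_primed)              # [[-1, 2], [-2, 6]]
print(trace_old, trace_new)  # 5 5
```

The individual components change under the change of basis, while the fully contracted quantity does not.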

We can now generalize (3.16) to an arbitrary (r,s) tensor T. We’ll proceed 
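exactly as above, except that we'll need $r$ covariant indices $i_1, \ldots, i_r$ and $s$ 
contravariant indices $j_1, \ldots, j_s$ as discussed in Box 3.2. We have 

$$\begin{aligned}
T_{i'_1 \ldots i'_r}{}^{j'_1 \ldots j'_s} &= T(e_{i'_1}, \ldots, e_{i'_r}, e^{j'_1}, \ldots, e^{j'_s}) \\
&= T(A^{k_1}_{i'_1} e_{k_1}, \ldots, A^{k_r}_{i'_r} e_{k_r}, A^{j'_1}_{l_1} e^{l_1}, \ldots, A^{j'_s}_{l_s} e^{l_s}) \\
&= A^{k_1}_{i'_1} \cdots A^{k_r}_{i'_r} A^{j'_1}_{l_1} \cdots A^{j'_s}_{l_s}\, T(e_{k_1}, \ldots, e_{k_r}, e^{l_1}, \ldots, e^{l_s}) \\
&= A^{k_1}_{i'_1} \cdots A^{k_r}_{i'_r} A^{j'_1}_{l_1} \cdots A^{j'_s}_{l_s}\, T_{k_1 \ldots k_r}{}^{l_1 \ldots l_s}.
\end{aligned} \tag{3.17}$$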


Equation (3.17) is the standard general tensor transformation law, which as 
remarked above is taken as the definition of a tensor in much of the physics 
literature; here, we have derived it as a consequence of our definition of a tensor as 
a multilinear function on V and V*. The two are equivalent, however, as you will 
check in Exercise 3.5 below. 

With the general transformation law in hand, we’ll now look at specific types of 
tensors and derive their matrix transformation laws; to this end, it will be useful to 
introduce the matrices 


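$$A = \begin{pmatrix} A^{1'}_1 & A^{1'}_2 & \cdots & A^{1'}_n \\ A^{2'}_1 & A^{2'}_2 & \cdots & A^{2'}_n \\ \vdots & \vdots & & \vdots \\ A^{n'}_1 & A^{n'}_2 & \cdots & A^{n'}_n \end{pmatrix}, \qquad A^{-1} = \begin{pmatrix} A^1_{1'} & A^1_{2'} & \cdots & A^1_{n'} \\ A^2_{1'} & A^2_{2'} & \cdots & A^2_{n'} \\ \vdots & \vdots & & \vdots \\ A^n_{1'} & A^n_{2'} & \cdots & A^n_{n'} \end{pmatrix}. \tag{3.18}$$

By virtue of (3.11) and (3.12), these matrices satisfy 

$$AA^{-1} = A^{-1}A = I \tag{3.19}$$

as our notation suggests. 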
Exercise 3.5. Consider a function which assigns to a basis $\{e_i\}_{i=1\ldots n}$ a set of numbers 
$\{T_{k_1 \ldots k_r}{}^{l_1 \ldots l_s}\}$ which transform according to (3.17) under a change of basis. Use this 
assignment to define a multilinear function T of type (r,s) on V, and be sure to check 
that your definition is basis independent (i.e., that the value of T does not depend on which 
basis $\{e_i\}_{i=1\ldots n}$ you choose). 

Fig. 3.2 The standard basis $\mathcal{B}$ and a new one $\mathcal{B}'$ obtained by rotation through an angle $\theta$ 


Example 3.6. Change of basis matrix for a 2-D rotation 


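As a simple illustration of the formalism, consider the standard basis $\mathcal{B}$ in $\mathbb{R}^2$ and 
another basis $\mathcal{B}'$ obtained by rotating $\mathcal{B}$ by an angle $\theta$. This is illustrated in Fig. 3.2. 
By inspection we have 

$$\begin{aligned}
e_{1'} &= \cos\theta\, e_1 + \sin\theta\, e_2 \\
e_{2'} &= -\sin\theta\, e_1 + \cos\theta\, e_2
\end{aligned} \tag{3.20}$$

and so by (3.9) we have 

$$A^1_{1'} = \cos\theta, \qquad A^1_{2'} = -\sin\theta, \qquad A^2_{1'} = \sin\theta, \qquad A^2_{2'} = \cos\theta.$$

Equation (3.18) then tells us that 

$$A^{-1} = \begin{pmatrix} \cos\theta & -\sin\theta \\ \sin\theta & \cos\theta \end{pmatrix}.$$

The numbers $A^{i'}_j$ and the corresponding matrix $A$ can be computed by either 
inverting $A^{-1}$ or equivalently by inverting the system (3.20) and proceeding as 
above. 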


Example 3.7. Vectors and Dual Vectors 


Given a vector $v$ [considered as a (0, 1) tensor as per (3.2)], Eq. (3.17) tells us that 
its components transform as 

$$v^{i'} = A^{i'}_j v^j \tag{3.21}$$

while the components of a dual vector $f$ transform as 

$$f_{i'} = A^j_{i'} f_j. \tag{3.22}$$

Notice that the components of $v$ transform with the $A^{i'}_j$ whereas the basis vectors 
transform with the $A^j_{i'}$, so the components of a vector obey the law opposite 
(‘contra’) to the basis vectors. This is the origin of the term “contravariant.” Note 
also that the components of a dual vector transform in the same way as the basis 
vectors, hence the term “covariant”.⁵ It makes sense that the basis vectors and the 
components of a vector should transform oppositely; $v$ exists independently of any 
basis for V and shouldn’t change under a change of basis, so if the $e_i$ change one 
way, the $v^i$ should change oppositely. Similar remarks apply to dual vectors. 
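
The opposite transformation behavior can be watched numerically for a rotated basis in $\mathbb{R}^2$. The sketch below assumes that the new basis is obtained by rotating the standard one through an angle `theta`, stores the two arrays of $A$'s as nested lists, and checks that the vector $v^{i'} e_{i'}$ and the number $f_{i'} v^{i'}$ come out the same as in the original basis. The sample components are made up.

```python
import math

theta = 0.3
c, s = math.cos(theta), math.sin(theta)

# A_inv[j][i] holds A^j_{i'}: column i lists the old-basis components of e_{i'}.
A_inv = [[c, -s],
         [s, c]]
# A[i][j] holds A^{i'}_j, the inverse of the array above.
A = [[c, s],
     [-s, c]]

v = [2.0, 1.0]     # components v^j of a vector in the old basis
f = [0.5, -3.0]    # components f_j of a dual vector in the old basis

v_new = [sum(A[i][j] * v[j] for j in range(2)) for i in range(2)]      # (3.21)
f_new = [sum(A_inv[j][i] * f[j] for j in range(2)) for i in range(2)]  # (3.22)

# Rebuild the vector from its new components: v^{i'} e_{i'}, with e_{i'} = A^j_{i'} e_j.
v_rebuilt = [sum(v_new[i] * A_inv[j][i] for i in range(2)) for j in range(2)]

pairing_old = sum(fj * vj for fj, vj in zip(f, v))
pairing_new = sum(fi * vi for fi, vi in zip(f_new, v_new))

print(v_rebuilt)                 # agrees with v up to round-off
print(pairing_old, pairing_new)  # both equal f(v) = -2.0 up to round-off
```

The components `v_new` and `f_new` each differ from the originals, but the combinations that do not refer to a basis are unchanged.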


Box 3.3 More on the Einstein Summation Convention 
We can now explain a little bit more about the Einstein summation convention. 
We knew ahead of time that the components of dual vectors would transform 
like basis vectors, so we gave them both lower indices. We also knew that the 
components of vectors would transform like dual basis vectors, so we gave 
them both upper indices. Since the two transformation laws are opposite, we 
know (see below) that a summation over an upper index and lower index will 
yield an object that does not transform at all, so the summation represents an 
object or a process that doesn’t depend upon a choice of basis. For instance, 
the expression $v^i e_i$ represents the vector $v$ which is defined without reference
to any basis, and the expression $f_i v^i$ is just $f(v)$, the action of the functional $f$
on the vector v, which is also defined without reference to any basis. Processes 
such as these are so important and ubiquitous that it becomes very convenient 
to omit the summation sign for repeated upper and lower indices, and we thus 
have the summation convention. 

What if we have two repeated upper indices or two repeated lower indices? 
In these cases we sometimes require summation and sometimes not, so we 
choose to indicate summation explicitly. An example where summation is 
required is (3.26) below; an example where summation is not desired is given 
by Euler’s equations of rigid body motion,⁶ which are sometimes written
collectively as

\[
\tau_i = \lambda_i \dot{\omega}_i + \sum_{j,k} \epsilon_{ijk}\, \omega_j \omega_k \lambda_k, \tag{3.23}
\]

where $\tau$ is the torque, $\lambda_i$ are the eigenvalues of the moment of inertia tensor (the
so-called principal moments of inertia), and $\omega$ is the angular velocity vector.
In the first term on the right-hand side of (3.23) summation over $i$ is not desired
since $i$ is actually a free index, as evidenced by the left-hand side.

⁵ For a more detailed and complete discussion of covariance and contravariance, see Fleisch [5].
⁶ See [8] for details.

In both (3.23) and (3.26) the repeated upper or lower indices arise from the 
use of a particular basis; in (3.26), it’s assumed that the basis is orthonormal, 
and in (3.23) it’s assumed that the basis is orthonormal and that the axes 
are eigenvectors of the moment of inertia tensor (the so-called principal 
axes). Such assumptions are usually the case when upper or lower indices are 
repeated, and one should then note that the resulting equations do not hold in 
arbitrary bases. Furthermore, if the repeated indices are summed over, then 
that summation represents an invariant process only when the accompanying 
assumption is satisfied. 


Returning to our discussion of how components of vectors and dual vectors 
transform, we can write (3.21) and (3.22) in terms of matrices as 


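\[
[v]_{\mathcal{B}'} = A\,[v]_{\mathcal{B}} \tag{3.24a}
\]
\[
[f]_{\mathcal{B}'} = A^{-1\,T}\,[f]_{\mathcal{B}}, \tag{3.24b}
\]

where the superscript $T$ again denotes the transpose of a matrix. From Box 3.3, we
know that $f(v)$ is basis-independent, but we also know that $f(v) = [f]^T_{\mathcal{B}}[v]_{\mathcal{B}}$. This
last equation then must be true in any basis, and we can in fact prove this using
(3.24): in a new basis $\mathcal{B}'$, we have

\[
\begin{aligned}
[f]^T_{\mathcal{B}'}[v]_{\mathcal{B}'} &= \left(A^{-1\,T}[f]_{\mathcal{B}}\right)^{T} A\,[v]_{\mathcal{B}} \\
&= [f]^T_{\mathcal{B}}\, A^{-1} A\, [v]_{\mathcal{B}} \\
&= [f]^T_{\mathcal{B}}[v]_{\mathcal{B}}.
\end{aligned}
\tag{3.25}
\]

This makes concrete our claim above that $[f]$ transforms “oppositely” to $[v]$, so
that the basis-independent object $f(v)$ really is invariant under a change of basis.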
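The invariance in (3.25) is easy to probe numerically. The sketch below assumes NumPy and uses an arbitrary invertible, deliberately non-orthogonal matrix `A` together with made-up component arrays; none of these numbers come from the text.

```python
import numpy as np

# An arbitrary invertible (and deliberately non-orthogonal) change of basis matrix.
A = np.array([[2.0, 1.0, 0.0],
              [0.0, 1.0, 3.0],
              [1.0, 0.0, 1.0]])
A_inv = np.linalg.inv(A)

v_old = np.array([1.0, -2.0, 0.5])   # components [v]_B
f_old = np.array([3.0, 0.0, -1.0])   # components [f]_B

v_new = A @ v_old          # (3.24a)
f_new = A_inv.T @ f_old    # (3.24b)

# f(v) computed in either basis gives the same number, as in (3.25).
print(f_old @ v_old, f_new @ v_new)

# Transforming f with A instead of the inverse transpose spoils the invariance.
f_wrong = A @ f_old
print(f_wrong @ v_new)
```

The last line is there only to show that the "opposite" transformation law for dual vectors is not optional when $A$ fails to be orthogonal.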

Before moving on to our next example we should point out a minor puzzle: you
showed in Exercise 2.23 that if we have an inner product $(\cdot|\cdot)$ on a real vector
space $V$ and an orthonormal basis $\{e_i\}_{i=1\ldots n}$, then the components of vectors and
their corresponding dual vectors are identical, which is why we were able to ignore
the distinction between them for so long. Equation (3.24) seems to contradict
this, however, since it looks like the components of dual vectors transform very
differently from the components of vectors. How do we explain this? Well, if we
change from one orthonormal basis to another, we have

\[
\delta_{i'j'} = (e_{i'}|e_{j'}) = A^{k}_{i'} A^{l}_{j'}\,(e_k|e_l) = \sum_{k=1}^{n} A^{k}_{i'} A^{k}_{j'}, \tag{3.26}
\]

which in matrices reads

\[
A^{-1\,T} A^{-1} = I,
\]

so we must have

\[
A^{-1} = A^{T}.
\]

Such matrices are known as orthogonal matrices, and we see here that a transformation
from one orthonormal basis to another is always implemented by an
orthogonal matrix.⁷ For such matrices (3.24a) and (3.24b) are identical, resolving
our contradiction.
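A small numerical illustration of this resolution, assuming NumPy: for a rotation the matrix acting on dual vector components coincides with the one acting on vector components, while for a generic invertible matrix (here a shear, chosen purely for illustration) it does not.

```python
import numpy as np

phi = np.pi / 5  # illustrative angle

# Rotation matrices relate orthonormal bases of R^2.
A = np.array([[np.cos(phi), np.sin(phi)],
              [-np.sin(phi), np.cos(phi)]])

is_orthogonal = np.allclose(np.linalg.inv(A), A.T)

# With A^{-1} = A^T, the matrix in (3.24b) equals the matrix A in (3.24a).
same_law = np.allclose(np.linalg.inv(A).T, A)

# A shear is invertible but does not map an orthonormal basis to an orthonormal one.
S = np.array([[1.0, 1.0],
              [0.0, 1.0]])
shear_same_law = np.allclose(np.linalg.inv(S).T, S)

print(is_orthogonal, same_law, shear_same_law)
```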

Incidentally, for a complex inner product space you will show that orthonormal
basis changes are implemented by matrices satisfying $A^{-1} = A^{\dagger}$. Such matrices are
known as unitary matrices and should be familiar from quantum mechanics.


Exercise 3.6. Show that for any invertible matrix $A$, $(A^{-1})^T = (A^T)^{-1}$, justifying the
sloppiness of our notation above.


Exercise 3.7. Show that for a complex inner product space $V$, the matrix $A$ implementing
an orthonormal change of basis satisfies $A^{-1} = A^{\dagger}$.


Example 3.8. Linear Operators 
We already noted that linear operators can be viewed as (1,1) tensors as per (3.3). 
Equation (3.17) then tells us that, for a linear operator $T$ on $V$,

\[
T_{i'}{}^{j'} = A^{k}_{i'}\, A^{j'}_{l}\, T_{k}{}^{l},
\]

which in matrix form reads

\[
[T]_{\mathcal{B}'} = A\,[T]_{\mathcal{B}}\,A^{-1}, \tag{3.27}
\]

which is the familiar similarity transformation of matrices. It just says that to
compute the action of $T$ in the primed basis, you use $A^{-1}$ to convert your column
vector $[v]$ from the new basis back to the old; then, you operate with $T$ via $[T]_{\mathcal{B}}$;
then, you convert back to the new basis using $A$.

Incidentally, the similarity transformation (3.27) allows us to extend the trace
functional of Example 2.17 from $n \times n$ matrices to linear operators as follows: Given
$T \in \mathcal{L}(V)$ and a basis $\mathcal{B}$ for $V$, define the trace of $T$ as

\[
\operatorname{Tr}(T) \equiv \operatorname{Tr}([T]_{\mathcal{B}}).
\]

You can then use (3.27) to show (see Exercise 3.10) that $\operatorname{Tr}(T)$ does not depend on
the choice of basis $\mathcal{B}$.
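The similarity transformation and the basis independence of the trace can be checked on a concrete case. This is a minimal sketch assuming NumPy; the matrices `A` and `T_old` are arbitrary illustrative choices, and it is of course no substitute for the proof requested in Exercise 3.10.

```python
import numpy as np

A = np.array([[2.0, 1.0, 0.0],
              [0.0, 1.0, 3.0],
              [1.0, 0.0, 1.0]])      # arbitrary invertible change of basis
A_inv = np.linalg.inv(A)

T_old = np.array([[1.0, 2.0, 0.0],
                  [0.0, -1.0, 4.0],
                  [3.0, 0.0, 2.0]])  # [T]_B, arbitrary
v_old = np.array([1.0, 0.0, -1.0])   # [v]_B

T_new = A @ T_old @ A_inv            # (3.27)

# Acting with T and then changing basis agrees with changing basis first.
lhs = A @ (T_old @ v_old)            # [Tv]_{B'}
rhs = T_new @ (A @ v_old)            # [T]_{B'} [v]_{B'}

print(np.allclose(lhs, rhs), np.trace(T_old), np.trace(T_new))
```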


⁷ See Problem 3-1 for more on orthogonal matrices, as well as Chap. 4.

Exercise 3.8. Show that for $v \in V$, $f \in V^*$, $T \in \mathcal{L}(V)$, $f(Tv) = [f]^T [T][v]$ is invariant
under a change of basis. Use the matrix transformation laws as we did in (3.25).


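Exercise 3.9. Let $\mathcal{B} = \{x, y, z\}$, $\mathcal{B}' = \{\tfrac{1}{\sqrt{2}}(x + iy),\ z,\ \tfrac{1}{\sqrt{2}}(x - iy)\}$ be bases for
$\mathcal{H}_1(\mathbb{R}^3)$, and consider the operator $L_z$, for which matrix expressions were found with respect
to both bases in Example 2.16. Find the numbers $A^{i'}_{j}$ and $A^{j}_{i'}$, and use these, along with
(3.27), to obtain $[L_z]_{\mathcal{B}'}$ from $[L_z]_{\mathcal{B}}$.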


Exercise 3.10. Show that (3.27) implies that $\operatorname{Tr}([T]_{\mathcal{B}})$ does not depend on the choice of
basis $\mathcal{B}$, so that $\operatorname{Tr}(T)$ is well defined.

Example 3.9. (2,0) Tensors 


(2,0) tensors g, which include important examples such as the Minkowski metric 
and the Euclidean metric, transform as follows according to (3.17): 


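\[
g_{i'j'} = A^{k}_{i'}\, A^{l}_{j'}\, g_{kl}
\]

or in matrix form

\[
[g]_{\mathcal{B}'} = A^{-1\,T}\,[g]_{\mathcal{B}}\,A^{-1}. \tag{3.28}
\]

Notice that if $g$ is an inner product and $\mathcal{B}$ and $\mathcal{B}'$ are orthonormal bases then
$[g]_{\mathcal{B}'} = [g]_{\mathcal{B}} = I$ and (3.28) becomes

\[
I = A^{-1\,T} A^{-1},
\]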


again telling us that A must be orthogonal. Also note that if A is orthogonal, (3.28) is 
identical to (3.27), so we don’t have to distinguish between (2,0) tensors and linear 
operators (as most of us haven’t in the past!). In the case of the Minkowski metric 
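$\eta$ we aren’t dealing with an inner product but we do have orthonormal bases, with
respect to which⁸ $\eta$ takes the form

\[
[\eta] = \begin{pmatrix} 1 & 0 & 0 & 0 \\ 0 & 1 & 0 & 0 \\ 0 & 0 & 1 & 0 \\ 0 & 0 & 0 & -1 \end{pmatrix},
\]

so if we are changing from one orthonormal basis to another we have

\[
\begin{pmatrix} 1 & 0 & 0 & 0 \\ 0 & 1 & 0 & 0 \\ 0 & 0 & 1 & 0 \\ 0 & 0 & 0 & -1 \end{pmatrix}
= A^{-1\,T} \begin{pmatrix} 1 & 0 & 0 & 0 \\ 0 & 1 & 0 & 0 \\ 0 & 0 & 1 & 0 \\ 0 & 0 & 0 & -1 \end{pmatrix} A^{-1}
\]

or equivalently

\[
\begin{pmatrix} 1 & 0 & 0 & 0 \\ 0 & 1 & 0 & 0 \\ 0 & 0 & 1 & 0 \\ 0 & 0 & 0 & -1 \end{pmatrix}
= A^{T} \begin{pmatrix} 1 & 0 & 0 & 0 \\ 0 & 1 & 0 & 0 \\ 0 & 0 & 1 & 0 \\ 0 & 0 & 0 & -1 \end{pmatrix} A. \tag{3.29}
\]

⁸ We assume here that the basis vector $e_i$ satisfying $\eta(e_i, e_i) = -1$ is the fourth vector in the basis,
which isn’t necessary but is somewhat conventional in physics.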
000-1 000-1 


Matrices $A$ satisfying (3.29) are known as Lorentz Transformations. Notice that
these matrices are not quite orthogonal, so the components of vectors will transform
slightly differently than those of dual vectors under these transformations. This is
in contrast to the case of $\mathbb{R}^n$ with a positive-definite metric, where if we go from
one orthonormal basis to another then the components of vectors and dual vectors
transform identically, as you showed in Exercise 2.23. □
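To see condition (3.29) at work, one can test it on a standard boost. The sketch below assumes NumPy and an illustrative speed `beta` (with $c = 1$); the boost matrix is the usual textbook form written with time as the fourth coordinate, which is our choice of example rather than something taken from this section.

```python
import numpy as np

eta = np.diag([1.0, 1.0, 1.0, -1.0])  # [eta] with the timelike vector fourth

beta = 0.6                             # illustrative boost speed (c = 1)
gamma = 1.0 / np.sqrt(1.0 - beta**2)

# A boost along the first axis, written with time as the fourth coordinate.
A = np.array([[gamma, 0.0, 0.0, -gamma * beta],
              [0.0,   1.0, 0.0,  0.0],
              [0.0,   0.0, 1.0,  0.0],
              [-gamma * beta, 0.0, 0.0, gamma]])

satisfies_329 = np.allclose(A.T @ eta @ A, eta)   # condition (3.29)
is_orthogonal = np.allclose(A.T @ A, np.eye(4))   # fails for nonzero beta

# Vector and dual vector components therefore transform with different matrices.
dual_matrix = np.linalg.inv(A).T
print(satisfies_329, is_orthogonal, np.allclose(dual_matrix, A))
```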


Exercise 3.11. As in previous exercises, show using the matrix transformation laws that
$g(v, w) = [w]^T [g][v]$ is invariant under a change of basis.


3.3 Active and Passive Transformations 


Before we move on to the tensor product, we have a little unfinished business to 
conclude. In the last section when we said that the $A^{j'}_{i}$ were not the components of
a tensor, we were lying a little; there is a related tensor lurking around, namely the
linear operator $U$ that takes the new basis vectors into the old, i.e. $U(e_{i'}) = e_i\ \forall\, i$
(the action of $U$ on an arbitrary vector is then given by expanding that vector in the
basis $\mathcal{B}'$ and using linearity). What are the components of this tensor? Well, in the
old basis $\mathcal{B}$ we have

\[
U_i{}^{j} = U(e_i, e^j) = e^j(U e_i) = e^j\!\left(U(A^{k'}_{i} e_{k'})\right) = A^{k'}_{i}\, e^j(U(e_{k'})) = A^{k'}_{i}\, e^j(e_k) = A^{j'}_{i} \tag{3.30}
\]

so the $A^{j'}_{i}$ actually are the components of a tensor!⁹ Why did we lie, then? Well,
the approach we have been taking so far is to try and think about things in a basis-independent
way, and although $U$ is a well-defined linear operator, its definition
depends entirely on the two bases we’ve chosen, so we may as well work directly
with the numbers that relate the bases. Also, using one primed index and one
unprimed index makes it easy to remember transformation laws like (3.14) and
(3.15), but is not consistent with our notation for the components of tensors.

⁹ If the sleight-of-hand with the primed and unprimed indices in the last couple steps of (3.30)
bothers you, puzzle it out and see if you can understand it. It may help to note that the prime on an
index doesn’t change its numerical value; it’s just a reminder that it refers to the primed basis.

Table 3.1 Summary of active vs. passive transformations

$[e_i]_{\mathcal{B}} = A\,[e_{i'}]_{\mathcal{B}}$      Active
$[v]_{\mathcal{B}'} = A\,[v]_{\mathcal{B}}$      Passive

If we write out the components of U as a matrix, you should verify that 


\[
[e_i]_{\mathcal{B}} = [U]_{\mathcal{B}}[e_{i'}]_{\mathcal{B}} = A\,[e_{i'}]_{\mathcal{B}}, \tag{3.31}
\]

which should be compared to (3.24a), which reads $[v]_{\mathcal{B}'} = A[v]_{\mathcal{B}}$. Equation (3.31)
is called an active transformation, since we use the matrix $A$ to change one vector
into another, namely $e_{i'}$ into $e_i$. Note that in (3.31) all vectors are expressed in the
same basis. Equation (3.24a), on the other hand, is called a passive transformation, 
since we use the matrix A not to change the vector v but rather to change the basis 
which v is referred to, hence changing its components. All this is summarized in 
Table 3.1. 

The notation in most physics texts is not as explicit as ours; one usually sees 
matrix equations like 


\[
\mathbf{r}' = A\,\mathbf{r} \tag{3.32}
\]


for both passive and active transformations, and one must rely on context to figure 
out how the equation is to be interpreted. In the active case, one considers the 
coordinate system fixed and interprets the matrix A as taking the physical vector 
r into a new vector r’, where the components of both are expressed in the same 
coordinate system, just as in (3.31). In the passive case, the physical vector r doesn’t 
change but the basis does, so one interprets the matrix A as taking the components 
of r in the old coordinate system and giving back the components of the same vector 
r in the new (primed) coordinate system, just as in (3.24a). All this is illustrated in 
Fig. 3.3. 

Before we get to some examples, note that in the passive transformation (3.24a) 
the matrix A takes the old components to the new components, whereas in the 
active transformation (3.31) $A$ takes the new basis vectors to the old ones. Thus
when $A$ is interpreted actively it corresponds to the transformation opposite to that of
the passive case. This dovetails with the fact that components and basis vectors
transform oppositely, as discussed under (3.22).


Example 3.10. Active and passive orthogonal transformations in two dimensions

Fig. 3.3 Illustration of the passive and active interpretations of $\mathbf{r}' = A\mathbf{r}$ in two dimensions. In (a) we
have a passive transformation, in which the same vector $\mathbf{r}$ is referred to two different bases. The
coordinate representation of $\mathbf{r}$ transforms as $\mathbf{r}' = A\mathbf{r}$, though the vector itself does not change.
In (b) we have an active transformation, where there is only one basis and the vector $\mathbf{r}$ is itself
transformed by $\mathbf{r}' = A\mathbf{r}$. In the active case the transformation is opposite that of the passive case


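Let $\mathcal{B} = \{e_1, e_2\}$ be the standard basis for $\mathbb{R}^2$, and consider a new basis $\mathcal{B}'$
given by

\[
\begin{aligned}
e_{1'} &= \frac{1}{\sqrt{2}}\, e_1 + \frac{1}{\sqrt{2}}\, e_2 \\
e_{2'} &= -\frac{1}{\sqrt{2}}\, e_1 + \frac{1}{\sqrt{2}}\, e_2.
\end{aligned}
\]

You can show (as in Example 3.6) that this leads to an orthogonal change of basis
matrix given by

\[
A = \begin{pmatrix} \frac{1}{\sqrt{2}} & \frac{1}{\sqrt{2}} \\ -\frac{1}{\sqrt{2}} & \frac{1}{\sqrt{2}} \end{pmatrix}, \tag{3.33}
\]

which corresponds to rotating our basis counterclockwise by $\phi = 45^\circ$, see Fig. 3.3a.
Now consider the vector $\mathbf{r} = \frac{1}{2\sqrt{2}}\, e_1 + \frac{1}{2\sqrt{2}}\, e_2$, also depicted in the figure. In the
standard basis we have

\[
[\mathbf{r}]_{\mathcal{B}} = \begin{pmatrix} \frac{1}{2\sqrt{2}} \\ \frac{1}{2\sqrt{2}} \end{pmatrix}.
\]

What does $\mathbf{r}$ look like in our new basis? From Fig. 3.3a we see that $\mathbf{r}$ is proportional
to $e_{1'}$, and that is indeed what we find; using (3.24a), we have

\[
[\mathbf{r}]_{\mathcal{B}'} = A\,[\mathbf{r}]_{\mathcal{B}}
= \begin{pmatrix} \frac{1}{\sqrt{2}} & \frac{1}{\sqrt{2}} \\ -\frac{1}{\sqrt{2}} & \frac{1}{\sqrt{2}} \end{pmatrix}
\begin{pmatrix} \frac{1}{2\sqrt{2}} \\ \frac{1}{2\sqrt{2}} \end{pmatrix}
= \begin{pmatrix} 1/2 \\ 0 \end{pmatrix} \tag{3.34}
\]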


as expected. Remember that the column vector at the end of (3.34) is expressed in 
the primed basis. 

This was the passive interpretation of (3.32); what about the active interpretation? 
Taking our matrix A and interpreting it as a linear operator represented in the 
standard basis we again have 


\[
[\mathbf{r}']_{\mathcal{B}} = A\,[\mathbf{r}]_{\mathcal{B}} = \begin{pmatrix} 1/2 \\ 0 \end{pmatrix}
\]

except that now the vector $(1/2, 0)$ represents the new vector $\mathbf{r}'$ in the same basis $\mathcal{B}$.
This is illustrated in Fig. 3.3b. As mentioned above, when $A$ is interpreted actively,
it corresponds to a clockwise rotation, opposite to its interpretation as a passive
transformation. □
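Both readings of Example 3.10 can be reproduced in a few lines. The following sketch assumes NumPy; the variable names are ours. It computes the same matrix product twice and checks the two interpretations separately: in the passive case the new components rebuild the original vector from the primed basis, while in the active case the result is a different vector, rotated clockwise by 45 degrees.

```python
import numpy as np

s = 1.0 / np.sqrt(2.0)
A = np.array([[s, s],
              [-s, s]])                 # (3.33)

e1_new = np.array([s, s])               # e_1' in the standard basis
e2_new = np.array([-s, s])              # e_2' in the standard basis
r_old = np.array([0.5 * s, 0.5 * s])    # [r]_B, i.e. 1/(2 sqrt 2) in each slot

# Passive reading: same vector, new components (3.34).
r_new = A @ r_old
same_vector = np.allclose(r_new[0] * e1_new + r_new[1] * e2_new, r_old)

# Active reading: the same numbers now describe a different vector r' in the basis B.
r_prime = A @ r_old
angle_before = np.degrees(np.arctan2(r_old[1], r_old[0]))
angle_after = np.degrees(np.arctan2(r_prime[1], r_prime[0]))

# A also carries the new basis vectors back to the old ones, as in (3.31).
print(r_new, same_vector, angle_before, angle_after)
print(A @ e1_new, A @ e2_new)
```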


Exercise 3.12. Verify (3.33) and (3.34). 
Example 3.11. Active transformations and rigid body motion 


Passive transformations are probably the ones encountered most often in classical 
physics, since a change of Cartesian coordinates induces a passive transformation.
Active transformations do crop up, though, especially in the case of rigid body
motion. In this scenario, one specifies the orientation of a rigid body by the time-dependent
orthogonal basis transformation $A(t)$ which relates the space frame $K'$
to the body frame $K(t)$ (we use here the notation of Example 2.12). As we saw
above, there corresponds to the time-dependent matrix $A(t)$ a time-dependent linear
operator $U(t)$ which satisfies $U(t)(e_{i'}) = e_i(t)$. If $K$ and $K'$ were coincident at
$t = 0$ and $\mathbf{r}_0$ is the position vector of a point $p$ of the rigid body at that time (see
Fig. 3.4a), then the position of $p$ at a later time is just $\mathbf{r}(t) = U(t)\mathbf{r}_0$ (see Fig. 3.4b),
which as a matrix equation in $K'$ would read

\[
[\mathbf{r}(t)]_{K'} = A(t)\,[\mathbf{r}_0]_{K'}. \tag{3.35}
\]

In more common and less precise notation this would be written

\[
\mathbf{r}(t) = A(t)\,\mathbf{r}_0.
\]

In other words, the position of a specific point on the rigid body at an arbitrary time
$t$ is given by the active transformation corresponding to the matrix $A(t)$. □


Example 3.12. Active and passive transformations and the Schrödinger and
Heisenberg pictures

Fig. 3.4 In (a) we have the coincident body and space frames at $t = 0$, along with the point $p$ of
the rigid body. In (b) we have the rotated rigid body, and the vector $\mathbf{r}(t)$ pointing to point $p$ now
has different components in the space frame, given by (3.35)


The duality between passive and active transformations is also present in quantum 
mechanics. In the Schrödinger picture, one considers observables like the momentum
or position operator as acting on the state ket while the basis kets remain fixed.
This is the active viewpoint. In the Heisenberg picture, however, one considers the
state ket to be fixed and considers the observables to be time-dependent (recall that
(2.18) is the equation of motion for these operators). Since the operators are time-dependent,
their eigenvectors (which form a basis¹⁰) are time-dependent as well, so
this picture is the passive one in which the vectors don’t change but the basis does.
Just as an equation like (3.32) can be interpreted in both the active and passive sense, 
a quantum-mechanical equation like

\[
\begin{aligned}
\langle \hat{x}(t) \rangle &= \langle \psi |\, (U^{\dagger} \hat{x}\, U)\, | \psi \rangle && \text{(3.36a)} \\
&= (\langle \psi | U^{\dagger})\, \hat{x}\, (U | \psi \rangle), && \text{(3.36b)}
\end{aligned}
\]

where $U$ is the time-evolution operator for time $t$, can also be interpreted in two
ways: in the active sense of (3.36b), in which the $U$’s act on the vectors and change
them into new vectors, and in the passive sense of (3.36a), where the $U$’s act on the
operator $\hat{x}$ by a similarity transformation to turn it into a new operator, $\hat{x}(t)$.
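The equality of (3.36a) and (3.36b) is just associativity of matrix multiplication, which a toy model makes explicit. The sketch below assumes NumPy and replaces the infinite-dimensional setting by a two-level system: the unitary `U`, the observable `X` (a Pauli matrix standing in for $\hat{x}$), and the state `psi` are illustrative stand-ins of our own choosing.

```python
import numpy as np

a = 0.4  # illustrative "time" parameter

# A unitary 2x2 matrix, U = exp(-i a sigma_x), standing in for time evolution.
U = np.array([[np.cos(a), -1j * np.sin(a)],
              [-1j * np.sin(a), np.cos(a)]])
X = np.array([[1.0, 0.0],
              [0.0, -1.0]], dtype=complex)   # a Hermitian observable (sigma_z)
psi = np.array([1.0, 0.0], dtype=complex)    # initial state

# Passive (Heisenberg) reading, (3.36a): transform the operator.
X_t = U.conj().T @ X @ U
heisenberg = np.vdot(psi, X_t @ psi)

# Active (Schrodinger) reading, (3.36b): transform the state.
psi_t = U @ psi
schrodinger = np.vdot(psi_t, X @ psi_t)

print(heisenberg.real, schrodinger.real)
```

For this particular choice both numbers equal $\cos 2a$, since the evolved state has amplitudes $\cos a$ and $-i \sin a$.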


¹⁰ For details on why the eigenvectors of Hermitian operators form a basis, at least in the finite-dimensional
case, see Hoffman and Kunze [13].

3.4 The Tensor Product: Definition and Properties


Now that we are familiar with tensors and their transformation laws, it is time to 
introduce the tensor product. The tensor product is one of the most basic operations 
with tensors and is commonplace in physics, but is often unacknowledged or, at 
best, dealt with in an ad-hoc fashion. Before we give the precise definition, which 
takes a little getting used to, we give a rough, heuristic description. Given two finite-dimensional
vector spaces $V$ and $W$ (over the same set of scalars $C$), we would like
to construct a product vector space, which we denote $V \otimes W$, whose elements are
in some sense “products” of vectors $v \in V$ and $w \in W$. We denote these products
by $v \otimes w$. This product, like any respectable product, should be bilinear in the sense
that

\[
\begin{aligned}
(v_1 + v_2) \otimes w &= v_1 \otimes w + v_2 \otimes w && \text{(3.37a)} \\
v \otimes (w_1 + w_2) &= v \otimes w_1 + v \otimes w_2 && \text{(3.37b)} \\
c\,(v \otimes w) &= (cv) \otimes w = v \otimes (cw), \quad c \in C. && \text{(3.37c)}
\end{aligned}
\]

Given these properties, the product of any two arbitrary vectors $v$ and $w$ can then
be expanded in terms of bases $\{e_i\}_{i=1\ldots n}$ and $\{f_j\}_{j=1\ldots m}$ for $V$ and $W$ as

\[
\begin{aligned}
v \otimes w &= (v^i e_i) \otimes (w^j f_j) \\
&= v^i w^j\, e_i \otimes f_j
\end{aligned}
\]

so $\{e_i \otimes f_j\}$, $i = 1 \ldots n$, $j = 1 \ldots m$ should be a basis for $V \otimes W$, which would
then have dimension $nm$. Thus the basis for the product space would be just the
product of the basis vectors, and the dimension of the product space would be just
the product of the dimensions.

Now let’s make this precise. Given two finite-dimensional vector spaces V and 
$W$, we define their tensor product $V \otimes W$ to be the set of all $C$-valued bilinear
functions on $V^* \times W^*$. Such functions do form a vector space, as you can easily
check. This definition may seem unexpected or counterintuitive at first, but you will
soon see that this definition does yield the vector space described above. Also, given
two vectors $v \in V$, $w \in W$, we define their tensor product $v \otimes w$ to be the element
of $V \otimes W$ defined as follows:

\[
(v \otimes w)(h, g) \equiv h(v)\, w(g) \quad \forall\, h \in V^*,\ g \in W^*. \tag{3.38}
\]

(Remember that an element of $V \otimes W$ is a bilinear function on $V^* \times W^*$, and so
is defined by its action on a pair $(h, g) \in V^* \times W^*$; here $w(g)$ means $g(w)$, with $w$
regarded as a function on $W^*$.) The bilinearity of the tensor
product is immediate and you can probably verify it without writing anything down:
just check that both sides of Eq. (3.37) are equal when evaluated on any pair of dual
vectors. To prove that $\{e_i \otimes f_j\}$, $i = 1 \ldots n$, $j = 1 \ldots m$ is a basis for $V \otimes W$,
let $\{e^i\}_{i=1\ldots n}$, $\{f^j\}_{j=1\ldots m}$ be the corresponding dual bases and consider an arbitrary
$T \in V \otimes W$. Using bilinearity,

\[
T(h, g) = h_i g_j\, T(e^i, f^j) = h_i g_j\, T^{ij} \tag{3.39}
\]

where $T^{ij} \equiv T(e^i, f^j)$. If we consider the expression $T^{ij} e_i \otimes f_j$, then

\[
\begin{aligned}
(T^{ij} e_i \otimes f_j)(e^k, f^l) &= T^{ij}\, e_i(e^k)\, f_j(f^l) \\
&= T^{ij}\, \delta_i{}^k\, \delta_j{}^l \\
&= T^{kl}
\end{aligned}
\]

so $T^{ij} e_i \otimes f_j$ agrees with $T$ on basis vectors, hence on all vectors by
bilinearity, so $T = T^{ij} e_i \otimes f_j$. Since $T$ was an arbitrary element of $V \otimes W$,
$V \otimes W = \operatorname{Span}\{e_i \otimes f_j\}$. Furthermore, the $e_i \otimes f_j$ are linearly independent as
you should check, so $\{e_i \otimes f_j\}$ is actually a basis for $V \otimes W$ and $V \otimes W$ thus has
dimension $mn$.
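In components, the construction above amounts to working with $n \times m$ arrays. The following sketch assumes NumPy and made-up component values; it represents $v \otimes w$ by the array $T^{ij} = v^i w^j$, evaluates it on a pair of dual vectors as in (3.39), and checks bilinearity in the first slot. The final lines illustrate a standard fact not spelled out above: a general element of $V \otimes W$ is a sum of products and need not be a single product.

```python
import numpy as np

v = np.array([1.0, 2.0])          # components v^i in V (n = 2)
w = np.array([0.0, -1.0, 3.0])    # components w^j in W (m = 3)
h = np.array([2.0, 1.0])          # components h_i of a dual vector on V
g = np.array([1.0, 1.0, -1.0])    # components g_j of a dual vector on W

# Components of the tensor product in the product basis: T^{ij} = v^i w^j.
T = np.outer(v, w)

# (3.38)-(3.39): evaluating on (h, g) gives h_i g_j T^{ij} = h(v) g(w).
value = h @ T @ g
product = (h @ v) * (g @ w)

# Bilinearity (3.37a) in the first slot.
v2 = np.array([-1.0, 0.5])
bilinear = np.allclose(np.outer(v + v2, w), np.outer(v, w) + np.outer(v2, w))

# A sum of two basis products has a rank-2 component array, whereas the
# component array of a single product has rank at most 1.
S = np.zeros((2, 3))
S[0, 0] = S[1, 1] = 1.0

print(T.size, value, product, bilinear, np.linalg.matrix_rank(S))
```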

The tensor product has a couple of important properties besides bilinearity. First, 
it commutes with taking duals, that is 


\[
(V \otimes W)^* = V^* \otimes W^*.
\]

Secondly, and more importantly, the tensor product is associative, i.e. for vector
spaces $V_i$, $i = 1, 2, 3$,

\[
(V_1 \otimes V_2) \otimes V_3 = V_1 \otimes (V_2 \otimes V_3).
\]

This property allows us to drop the parentheses and write expressions like
$V_1 \otimes \cdots \otimes V_n$ without ambiguity. One can think of $V_1 \otimes \cdots \otimes V_n$ as the set of
$C$-valued multilinear functions on $V_1^* \times \cdots \times V_n^*$.

These two properties are both plausible, particularly when thought of in terms 
of basis vectors, but verifying them rigorously turns out to be slightly tedious. See 
Warner [21] for proofs and further details. 

Exercise 3.13. If $\{e_i\}$, $\{f_j\}$, and $\{g_k\}$ are bases for $V_1$, $V_2$, and $V_3$ respectively, convince
yourself that $\{e_i \otimes f_j \otimes g_k\}$ is a basis for $V_1 \otimes V_2 \otimes V_3$, and hence that $\dim V_1 \otimes V_2 \otimes V_3 =
n_1 n_2 n_3$ where $\dim V_i = n_i$. Extend the above to $n$-fold tensor products.


3.5 Tensor Products of V and V* 


In the previous section we defined the tensor product for two arbitrary vector spaces 
V and W. Often, though, we’ll be interested in just the iterated tensor product of a 
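vector space and its dual, i.e. in tensor products of the form

\[
\underbrace{V^* \otimes \cdots \otimes V^*}_{r\ \text{times}} \otimes \underbrace{V \otimes \cdots \otimes V}_{s\ \text{times}}. \tag{3.40}
\]

This space is of particular interest because it is actually identical to $\mathcal{T}^r_s$! From the
previous section we know that the vector space in (3.40) can be interpreted as the
set of multilinear functions on¹¹

\[
\underbrace{V \times \cdots \times V}_{r\ \text{times}} \times \underbrace{V^* \times \cdots \times V^*}_{s\ \text{times}}, \tag{3.41}
\]

but these functions are exactly $\mathcal{T}^r_s$! Since the space in (3.40) has basis $\mathcal{B}^r_s = \{e^{i_1} \otimes
\cdots \otimes e^{i_r} \otimes e_{j_1} \otimes \cdots \otimes e_{j_s}\}$, we can conclude that $\mathcal{B}^r_s$ is a basis for $\mathcal{T}^r_s$. In fact, we
claim that if $T \in \mathcal{T}^r_s$ has components $T_{i_1 \ldots i_r}{}^{j_1 \ldots j_s}$, then

\[
T = T_{i_1 \ldots i_r}{}^{j_1 \ldots j_s}\; e^{i_1} \otimes \cdots \otimes e^{i_r} \otimes e_{j_1} \otimes \cdots \otimes e_{j_s} \tag{3.42}
\]

is the expansion of $T$ in the basis $\mathcal{B}^r_s$. To prove this, we just need to check that both
sides agree when evaluated on an arbitrary set of basis vectors; on the left-hand side
we get $T(e_{i_1}, \ldots, e_{i_r}, e^{j_1}, \ldots, e^{j_s}) = T_{i_1 \ldots i_r}{}^{j_1 \ldots j_s}$ by definition, and on the right-hand
side we have

\[
\begin{aligned}
& T_{k_1 \ldots k_r}{}^{l_1 \ldots l_s}\; e^{k_1} \otimes \cdots \otimes e^{k_r} \otimes e_{l_1} \otimes \cdots \otimes e_{l_s}\,(e_{i_1}, \ldots, e_{i_r}, e^{j_1}, \ldots, e^{j_s}) \\
&\quad = T_{k_1 \ldots k_r}{}^{l_1 \ldots l_s}\; e^{k_1}(e_{i_1}) \cdots e^{k_r}(e_{i_r})\; e_{l_1}(e^{j_1}) \cdots e_{l_s}(e^{j_s}) \\
&\quad = T_{k_1 \ldots k_r}{}^{l_1 \ldots l_s}\; \delta^{k_1}_{i_1} \cdots \delta^{k_r}_{i_r}\; \delta^{j_1}_{l_1} \cdots \delta^{j_s}_{l_s} \\
&\quad = T_{i_1 \ldots i_r}{}^{j_1 \ldots j_s}
\end{aligned}
\tag{3.43}
\]

so our claim is true. Thus, for instance, a (2,0) tensor like the Minkowski metric
can be written as $\eta = \eta_{\mu\nu}\, e^{\mu} \otimes e^{\nu}$. Conversely, a tensor product like $f \otimes g =
f_i g_j\, e^i \otimes e^j \in \mathcal{T}^2_0$ thus has components $(f \otimes g)_{ij} = f_i g_j$. Notice that we now
have two ways of thinking about components: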
have two ways of thinking about components: 


1. As the values of a tensor on sets of basis vectors, as in (3.5) 
2. As the expansion coefficients in a given basis, as in (3.42) 


This duplicity of perspective was pointed out in the case of vectors just above 
Exercise 2.12, and it’s essential that you be comfortable thinking about components 
in either way. 
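The two views of components can be compared directly in a basis that is not orthonormal, where the distinction between a basis and its dual is visible. This is a sketch assuming NumPy; the basis matrix `E` and the bilinear form `M` are arbitrary illustrative choices, and the "reference coordinates" in which they are written are an auxiliary device of ours.

```python
import numpy as np

# Columns of E are basis vectors e_1, e_2 written in some reference coordinates
# (a deliberately non-orthonormal, illustrative choice).
E = np.array([[1.0, 1.0],
              [0.0, 2.0]])
D = np.linalg.inv(E)   # rows of D are the dual basis vectors e^1, e^2

# A (2,0) tensor given as a bilinear function T(v, w) = v^T M w in reference coordinates.
M = np.array([[1.0, 3.0],
              [-2.0, 0.5]])

def T(v, w):
    return v @ M @ w

# Way 1: components as values on basis vectors, T_ij = T(e_i, e_j).
comps = np.array([[T(E[:, i], E[:, j]) for j in range(2)] for i in range(2)])

# Way 2: the same numbers are the expansion coefficients in the basis of
# products of dual basis vectors, each of which is an outer product of rows of D.
rebuilt = sum(comps[i, j] * np.outer(D[i], D[j]) for i in range(2) for j in range(2))

print(np.allclose(rebuilt, M))
```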


Exercise 3.14. Compute the dimension of $\mathcal{T}^r_s$.


¹¹ Actually, to interpret (3.40) as the space of multilinear functions on (3.41) also requires the fact
that $(V^*)^* \simeq V$. See Problem 2-5 for a proof of this.

Exercise 3.15. Let $T_1$ and $T_2$ be tensors of type $(r_1, s_1)$ and $(r_2, s_2)$ respectively on a vector
space $V$. Show that $T_1 \otimes T_2$ can be viewed as an $(r_1 + r_2, s_1 + s_2)$ tensor, so that the tensor
product of two tensors is again a tensor, justifying the nomenclature.


One important operation on tensors which we can now discuss is that of 
contraction, which is the generalization of the trace functional to tensors of arbitrary 
rank: Given $T \in \mathcal{T}^r_s(V)$ with expansion

\[
T = T_{i_1 \ldots i_r}{}^{j_1 \ldots j_s}\; e^{i_1} \otimes \cdots \otimes e^{i_r} \otimes e_{j_1} \otimes \cdots \otimes e_{j_s}, \tag{3.44}
\]

we can define a contraction of $T$ to be any $(r-1, s-1)$ tensor resulting from feeding
$e^i$ into one of the arguments, $e_i$ into another, and then summing over $i$ as implied by
the summation convention. For instance, if we feed $e_i$ into the $r$th slot and $e^i$ into
the $(r+s)$th slot and sum, we get the $(r-1, s-1)$ tensor $\tilde{T}$ defined as

\[
\tilde{T}(v_1, \ldots, v_{r-1}, f_1, \ldots, f_{s-1}) \equiv \sum_i T(v_1, \ldots, v_{r-1}, e_i, f_1, \ldots, f_{s-1}, e^i).
\]

You may be suspicious that $\tilde{T}$ depends on our choice of basis, but Exercise 3.16
shows that contraction is in fact well defined. Notice that the components of $\tilde{T}$ are

\[
\tilde{T}_{i_1 \ldots i_{r-1}}{}^{j_1 \ldots j_{s-1}} = \sum_l T_{i_1 \ldots i_{r-1}\, l}{}^{j_1 \ldots j_{s-1}\, l}.
\]


(The summation in the previous two equations is not necessary, since we are 
assuming the Einstein summation convention, but is written explicitly for emphasis.) 
Similar contractions can be performed on any two arguments of T provided one 
argument eats vectors and the other dual vectors. In terms of components, a 
contraction can be taken with respect to any pair of indices provided that one is 
covariant and the other contravariant. If we are working on a vector space equipped 
with a metric g, then we can use the metric to raise and lower indices and so can 
contract on any pair of indices, even if they’re both covariant or contravariant. For 
instance, we can contract a (2,0) tensor $T$ with components $T_{ij}$ as $\tilde{T} = T_i{}^i =
g^{ij} T_{ij}$, which one can interpret as just the trace of the associated linear operator (or
(1,1) tensor). For a linear operator or any other rank 2 tensor, this is the only option
for contraction. If we have two linear operators $A$ and $B$, then their tensor product
$A \otimes B \in \mathcal{T}^2_2$ has components

\[
(A \otimes B)_{ik}{}^{jl} = A_i{}^j\, B_k{}^l,
\]

and contracting on the first and last index gives a (1,1) tensor $AB$ whose
components are

\[
(AB)_k{}^j = A_l{}^j\, B_k{}^l.
\]

You should check that this tensor is just the composition of $A$ and $B$, as our notation
suggests. What linear operator do we get if we consider the other contraction
$A_i{}^j B_j{}^l$?
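Contractions translate directly into `numpy.einsum` index strings. The sketch below assumes NumPy and two arbitrary $2 \times 2$ matrices. It also assumes the bookkeeping in which the component $T_i{}^j$ sits in row $j$, column $i$ of the matrix of $T$ (as in Exercise 3.17), so that component arrays indexed `[i, j]` are transposes of the matrices. It checks the first contraction only and leaves the question just posed to the reader.

```python
import numpy as np

# Matrices of two linear operators (arbitrary illustrative entries).
A_mat = np.array([[1.0, 2.0],
                  [0.0, -1.0]])
B_mat = np.array([[3.0, 0.0],
                  [1.0, 4.0]])

# Component arrays indexed as X[i, j] = X_i^j (lower index first, upper second).
# Assumed convention: the matrix has X_i^j in row j, column i.
A = A_mat.T
B = B_mat.T

# Tensor product components A_i^j B_k^l, stored with index order [i, k, j, l].
AB_tensor = np.einsum('ij,kl->ikjl', A, B)

# Contract the first index (i) with the last (l): (AB)_k^j = A_l^j B_k^l.
AB = np.einsum('lkjl->kj', AB_tensor)

# Contracting the lower index of a (1,1) tensor with its upper index gives the trace.
trace_A = np.einsum('ii->', A)

print(np.allclose(AB.T, A_mat @ B_mat), trace_A)
```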


Exercise 3.16. Show that if $\{e_i\}_{i=1\ldots n}$ and $\{e_{i'}\}_{i=1\ldots n}$ are two arbitrary bases, then

\[
T(v_1, \ldots, v_{r-1}, e_i, f_1, \ldots, f_{s-1}, e^i) = T(v_1, \ldots, v_{r-1}, e_{i'}, f_1, \ldots, f_{s-1}, e^{i'})
\]

so that contraction is well defined.

Example 3.13. V* & V 


One of the most important examples of tensor products of the form (3.40) is $V^* \otimes V$,
which as we mentioned is the same as $\mathcal{T}^1_1$, the space of linear operators. How does
this identification work, explicitly? Well, given $f \otimes v \in V^* \otimes V$, we can define a
linear operator by $(f \otimes v)(w) \equiv f(w)\,v$. More generally, given

\[
T_i{}^j\, e^i \otimes e_j \in V^* \otimes V, \tag{3.45}
\]

we can define a linear operator $T$ by

\[
T(v) = T_i{}^j\, e^i(v)\, e_j = v^i\, T_i{}^j\, e_j,
\]

which is identical to (2.20). This identification of $V^* \otimes V$ and linear operators is
actually implicit in many quantum-mechanical expressions. Let $\mathcal{H}$ be a quantum-mechanical
Hilbert space and let $\psi, \phi \in \mathcal{H}$ so that $L(\phi) \in \mathcal{H}^*$. The tensor product
of $L(\phi)$ and $\psi$, which we would write as $L(\phi) \otimes \psi$, is written in Dirac notation
as $|\psi\rangle\langle\phi|$ (note the transposition of the factors relative to our convention). If we’re
given an orthonormal basis $\mathcal{B} = \{|i\rangle\}$, the expansion (3.45) of an arbitrary operator
$H$ can be written in Dirac notation as

\[
H = \sum_{i,j} H_{ij}\, |i\rangle\langle j|,
\]
ij 


an expression which may be familiar from advanced quantum mechanics texts.¹² In
particular, the identity operator can be written as

\[
I = \sum_i |i\rangle\langle i|,
\]

which is referred to as the resolution of the identity with respect to the basis $\{|i\rangle\}$.
A word about nomenclature: In quantum mechanics and other contexts the tensor
product is often referred to as the direct or outer product. This last term is meant
to distinguish it from the inner product, since both the outer and inner products eat
a dual vector and a vector (strictly speaking the inner product eats 2 vectors, but
remember that with an inner product we may identify vectors and dual vectors) but
the outer product yields a linear operator whereas the inner product yields a scalar.

¹² We don’t bother here with index positions since most quantum mechanics texts don’t employ
Einstein summation convention, preferring instead to explicitly indicate summation.
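The Dirac-notation expansions above can be verified on a small example. This sketch assumes NumPy; the orthonormal basis of $\mathbb{C}^2$ and the Hermitian matrix `H` are illustrative choices, and the helper `outer` is a hypothetical name of ours for forming a ket-bra product.

```python
import numpy as np

s = 1.0 / np.sqrt(2.0)
# Columns are an orthonormal basis of C^2 (an illustrative choice).
kets = np.array([[s, s],
                 [1j * s, -1j * s]])

def outer(ket, bra_vector):
    """Return the ket-bra product as a matrix; the bra is the conjugate of its vector."""
    return np.outer(ket, bra_vector.conj())

# Resolution of the identity: summing the ket-bra of each basis vector gives I.
identity = sum(outer(kets[:, i], kets[:, i]) for i in range(2))

# An arbitrary Hermitian operator and its matrix elements between basis kets.
H = np.array([[1.0, 2.0 - 1.0j],
              [2.0 + 1.0j, 3.0]])
H_ij = np.array([[np.vdot(kets[:, i], H @ kets[:, j]) for j in range(2)]
                 for i in range(2)])

# Expansion of H as a sum of H_ij times the ket-bra built from kets i and j.
H_rebuilt = sum(H_ij[i, j] * outer(kets[:, i], kets[:, j])
                for i in range(2) for j in range(2))

print(np.allclose(identity, np.eye(2)), np.allclose(H_rebuilt, H))
```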


Exercise 3.17. Interpret $e^i \otimes e_j$ as a linear operator, and convince yourself that its matrix
representation is

\[
[e^i \otimes e_j] = E_{ji}.
\]

Recall that $E_{ji}$ is one of the elementary basis matrices introduced way back in Example 2.9,
and has a 1 in the $j$th row and $i$th column and zeros everywhere else.


3.6 Applications of the Tensor Product in Classical Physics 


Example 3.14. Moment of inertia tensor revisited 


We took an abstract look at the moment of inertia tensor in Example 3.3; now, 
armed with the tensor product, we can examine the moment of inertia tensor more 
concretely. Consider a rigid body with a fixed point $O$, so that it has only rotational
degrees of freedom ($O$ need not be the center of mass). Let $O$ be the
origin, pick time-dependent body-fixed axes $K = \{\hat{x}(t), \hat{y}(t), \hat{z}(t)\}$, and let $g$ denote
the Euclidean metric. Recall that $g$ allows us to define a map $L$ from vectors to dual
vectors. Also, let the $i$th particle in the rigid body have mass $m_i$ and position vector
$\mathbf{r}_i$ with $[\mathbf{r}_i]_K = (x_i, y_i, z_i)$ relative to $O$, and let $r_i^2 \equiv g(\mathbf{r}_i, \mathbf{r}_i)$. This is illustrated in
Fig. 3.5. The (2,0) moment of inertia tensor is then given by

\[
\mathcal{I}_{(2,0)} = \sum_i m_i \left( r_i^2\, g - L(\mathbf{r}_i) \otimes L(\mathbf{r}_i) \right) \tag{3.46}
\]

while the (1,1) tensor reads

\[
\mathcal{I}_{(1,1)} = \sum_i m_i \left( r_i^2\, I - L(\mathbf{r}_i) \otimes \mathbf{r}_i \right). \tag{3.47}
\]

You should check that in components (3.46) reads

\[
\mathcal{I}_{jk} = \sum_i m_i \left( r_i^2\, \delta_{jk} - (\mathbf{r}_i)_j (\mathbf{r}_i)_k \right).
\]


Writing a couple of components explicitly yields

\[
\begin{aligned}
\mathcal{I}_{xx} &= \sum_i m_i\,(y_i^2 + z_i^2) \\
\mathcal{I}_{xy} &= -\sum_i m_i\, x_i y_i,
\end{aligned}
\tag{3.48}
\]

expressions which should be familiar from classical mechanics. So long as the basis
is orthonormal, the components $\mathcal{I}_j{}^k$ of the (1,1) tensor in (3.47) will be the same
as for the (2,0) tensor, as remarked earlier. Note that if we had not used body-fixed
axes, the components of $\mathbf{r}_i$ [and hence the components of $\mathcal{I}$, by (3.48)] would in
general be time-dependent; this is the main reason for using the body-fixed axes in
computation.

Fig. 3.5 The rigid body with fixed point $O$ and body-fixed axes $K = \{\hat{x}, \hat{y}, \hat{z}\}$, along
with $i$th particle at position $\mathbf{r}_i$ with mass $m_i$
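The component formula for the moment of inertia tensor maps directly onto outer products. The following sketch assumes NumPy, an orthonormal body-fixed basis (so that $[g] = I$ and $L(\mathbf{r}_i)$ has the same components as $\mathbf{r}_i$), and a made-up two-particle body; the masses and positions are illustrative only.

```python
import numpy as np

masses = np.array([1.0, 2.0])               # m_i (illustrative values)
positions = np.array([[1.0, 2.0, 0.0],      # components of r_i in K, one row per particle
                      [0.0, 1.0, 1.0]])

# In an orthonormal basis (3.46) becomes
# I_jk = sum_i m_i (r_i^2 delta_jk - (r_i)_j (r_i)_k).
inertia = sum(m * ((r @ r) * np.eye(3) - np.outer(r, r))
              for m, r in zip(masses, positions))

# The explicit components of (3.48) for comparison.
x, y, z = positions.T
I_xx = np.sum(masses * (y**2 + z**2))
I_xy = -np.sum(masses * x * y)

print(inertia[0, 0], I_xx, inertia[0, 1], I_xy)
```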


Example 3.15. Maxwell Stress Tensor 


In considering the conservation of total momentum (mechanical plus electromagnetic)
in classical electrodynamics one encounters the symmetric rank 2 Maxwell
Stress Tensor, defined in (2,0) form as¹³

\[
T_{(2,0)} = E \otimes E + B \otimes B - \frac{1}{2}(E \cdot E + B \cdot B)\, g,
\]

where $E$ and $B$ are the dual vector versions of the electric and magnetic field
vectors. $T$ can be interpreted in the following way: $T(v, w)$ gives the rate at which
momentum in the $v$-direction flows in the $w$-direction. In components we have

\[
T_{ij} = E_i E_j + B_i B_j - \frac{1}{2}(E \cdot E + B \cdot B)\, \delta_{ij},
\]

which is the expression found in most classical electrodynamics textbooks.
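As a concrete instance of the component formula, the sketch below builds the stress tensor from outer products. It assumes NumPy, an orthonormal basis, units in which the physical constants are 1, and arbitrary illustrative field values. The function name `momentum_flow` is ours and simply evaluates $T(v, w) = v^i w^j T_{ij}$.

```python
import numpy as np

E = np.array([1.0, 0.0, 2.0])   # illustrative field components in an orthonormal basis
B = np.array([0.0, 3.0, 0.0])

half_sum = 0.5 * (E @ E + B @ B)   # (E.E + B.B)/2

# T_ij = E_i E_j + B_i B_j - (1/2)(E.E + B.B) delta_ij
T = np.outer(E, E) + np.outer(B, B) - half_sum * np.eye(3)

def momentum_flow(v, w):
    """Evaluate the (2,0) tensor on two vectors: T(v, w) = v^i w^j T_ij."""
    return v @ T @ w

x_hat = np.array([1.0, 0.0, 0.0])
z_hat = np.array([0.0, 0.0, 1.0])

print(T)
print(momentum_flow(x_hat, z_hat), np.trace(T))
```

For these values the tensor is symmetric, as claimed, and its trace works out to minus `half_sum`, which follows directly from the formula in three dimensions.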


¹³ Recall that we’ve set all physical constants such as $c$ and $\epsilon_0$ equal to 1.

Example 3.16. The Electromagnetic Field tensor


As you have probably seen in discussions of relativistic electrodynamics, the
electric and magnetic field vectors are properly viewed as components of a rank
2 antisymmetric tensor $F$, the electromagnetic field tensor.¹⁴ To write $F$ in
component-free notation requires machinery outside the scope of this text,¹⁵ so we
settle for its expression as a matrix in an orthonormal basis, which in (2,0) form is

\[
[F_{(2,0)}] = \begin{pmatrix} 0 & -B_z & B_y & -E_x \\ B_z & 0 & -B_x & -E_y \\ -B_y & B_x & 0 & -E_z \\ E_x & E_y & E_z & 0 \end{pmatrix}. \tag{3.49}
\]

The Lorentz force law

\[
\frac{dp^{\mu}}{d\tau} = q\, F^{\mu}_{\ \nu}\, v^{\nu},
\]

where $p = mv$ is the four-momentum of a particle, $v$ is its four-velocity, $\tau$ its proper
time, and $q$ its charge, can be rewritten without components as

\[
\frac{dp}{d\tau} = q\, F_{(1,1)}(v), \tag{3.50}
\]

which just says that the Minkowski force $\frac{dp}{d\tau}$ on a particle is given by the action of
the field tensor on the particle’s 4-velocity!
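One way to gain confidence in (3.49) and (3.50) is to check that they reproduce the familiar three-dimensional force $q(\mathbf{E} + \mathbf{u} \times \mathbf{B})$ scaled by $\gamma$. The sketch below assumes NumPy, illustrative field and velocity values, and units with $c = 1$. It also makes an assumption about conventions that the text does not spell out here: the (1,1) tensor is obtained by raising the second index of $F_{(2,0)}$ with the inverse metric, and it acts on a vector as $(Fv)^j = v^i F_i{}^j$, following the operator convention of (3.45).

```python
import numpy as np

E = np.array([1.0, 0.0, 2.0])    # illustrative fields
B = np.array([0.0, 3.0, -1.0])
Ex, Ey, Ez = E
Bx, By, Bz = B

# (3.49): the (2,0) field tensor in an orthonormal basis, time as the fourth coordinate.
F_low = np.array([[0.0, -Bz,  By, -Ex],
                  [ Bz, 0.0, -Bx, -Ey],
                  [-By,  Bx, 0.0, -Ez],
                  [ Ex,  Ey,  Ez, 0.0]])
eta_inv = np.diag([1.0, 1.0, 1.0, -1.0])   # inverse Minkowski metric (equal to the metric here)

q = 1.0
u = np.array([0.3, -0.2, 0.1])             # ordinary velocity, with speed below 1
gamma = 1.0 / np.sqrt(1.0 - u @ u)
v = gamma * np.append(u, 1.0)              # four-velocity

# Assumed convention: raise the second index, F_i^j = F_ik eta^{kj}, and act as
# (F v)^j = v^i F_i^j.
force = q * np.einsum('i,ik,kj->j', v, F_low, eta_inv)

expected_spatial = q * gamma * (E + np.cross(u, B))
expected_time = q * gamma * (E @ u)

print(force)
print(expected_spatial, expected_time)
```

Because $F$ is antisymmetric, raising the other index instead would flip the overall sign, so the agreement with the three-dimensional force law is what singles out the convention used in this sketch.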


3.7 Applications of the Tensor Product in Quantum Physics 


In this section we’ll discuss further applications of the tensor product in quantum 
mechanics, in particular the oft-unwritten rule that to add degrees of freedom one 
should take the tensor product of the corresponding Hilbert spaces. Before we get to 
this, however, we must set up a little more machinery and address an issue that we’ve 
so far swept under the rug. The issue is that when dealing with spatial degrees of 
freedom, as opposed to “internal” degrees of freedom like spin, we often encounter 
Hilbert spaces like $L^2([-a,a])$ and $L^2(\mathbb{R})$ which are most conveniently described
by “basis” vectors which are eigenvectors of either the position operator $\hat{x}$ or the


14In this example and the one above we are actually not dealing with tensors but with tensor fields,
i.e. tensor-valued functions on space and spacetime. For the discussion here, however, we will 
ignore the spatial dependence, focusing instead on the tensorial properties. 


15One needs the exterior derivative, a generalization of the curl, divergence and gradient operators
from vector calculus. See Schutz [18] for a very readable account. 
momentum operator $\hat{p}$. The trouble with these bases is that they are often non-denumerably
infinite (i.e., can’t be indexed by the integers, unlike all the bases we’ve
worked with so far) and, what’s worse, the “basis vectors” don’t even belong to
the Hilbert space! Consider, for example, $L^2(\mathbb{R})$. The position operator $\hat{x}$ acts on
functions $\psi(x) \in L^2(\mathbb{R})$ by

$$\hat{x}\,\psi(x) = x\,\psi(x). \tag{3.51}$$

If we follow the practice of most quantum mechanics texts and treat the Dirac delta
functional $\delta$ as $L(\delta(x))$ where $\delta(x)$, the “Dirac delta function,” is infinite at 0 and 0
elsewhere, you can check (see Exercise 3.18) that

$$\hat{x}\,\delta(x - x_0) = x_0\,\delta(x - x_0)$$

so that $\delta(x - x_0)$ is an “eigenfunction” of $\hat{x}$ with eigenvalue $x_0$ (in Dirac notation we
write the corresponding ket as $|x_0\rangle$). The trouble is that, as we saw in Example 2.25,
there is no such $\delta(x) \in L^2(\mathbb{R})$! Furthermore, since the basis $\{\delta(x - x_0)\}_{x_0 \in \mathbb{R}}$ is
indexed by $\mathbb{R}$ and not some subset of $\mathbb{Z}$, we must expand $\psi \in L^2(\mathbb{R})$ by integrating
instead of summing. Integration, however, is a limiting procedure and one should
really worry about what it means for an integral to converge. Rectifying all this in
a rigorous manner is possible,$^{16}$ but outside the scope of this text, unfortunately.
We do wish to work with these objects, however, so we will content ourselves with
the traditional approach: ignore the fact that the delta functions are not elements of
$L^2(\mathbb{R})$, work without discomfort with the basis $\{\delta(x - x_0)\}_{x_0 \in \mathbb{R}}$,$^{17}$ and fearlessly
expand arbitrary functions $\psi$ in the basis $\{\delta(x - x_0)\}_{x_0 \in \mathbb{R}}$ as


$$\psi(x) = \int_{-\infty}^{\infty} dx'\, \psi(x')\,\delta(x - x'), \tag{3.52}$$

where the above equation can be interpreted both as the expansion of $\psi$ and just the
definition of the delta function. In Dirac notation (3.52) reads

$$|\psi\rangle = \int_{-\infty}^{\infty} dx'\, \psi(x')\,|x'\rangle. \tag{3.53}$$

Note that we can think of the numbers $\psi(x)$ as the components of $|\psi\rangle$ with respect
to the basis $\{|x\rangle\}_{x \in \mathbb{R}}$. Alternatively, if we define the inner product of our basis
vectors to be

$$\langle x|x'\rangle \equiv \delta(x - x')$$


16This requires the so-called rigged Hilbert space; see Ballentine [2]. 


17Working with the momentum eigenfunctions $e^{ipx}$ instead doesn’t help; though these are
legitimate functions, they still are not square-integrable since $\int_{-\infty}^{\infty} |e^{ipx}|^2\, dx = \infty$!

as is usually done, then using (3.53) we have

$$\psi(x) = \langle x|\psi\rangle \tag{3.54}$$


which gives another interpretation of $\psi(x)$. These two interpretations of $\psi(x)$ are
just the ones mentioned below (3.43); that is, the components of a vector can be 
interpreted either as expansion coefficients, as in (3.53), or as the value of a given 
dual vector on the vector, as in (3.54). 


Exercise 3.18. Check/review the following properties of the delta function: 
(a) By considering the integral 


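$$\int_{-\infty}^{\infty} \left(\hat{x}\,\delta(x - x_0)\right) f(x)\, dx$$

(where $f$ is an arbitrary square-integrable function), show formally that

$$\hat{x}\,\delta(x - x_0) = x_0\,\delta(x - x_0).$$

(b) Check that $\{\delta(x - x_0)\}_{x_0 \in \mathbb{R}}$ satisfies (2.36).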
(c) Verify (3.54). 


While we're at it, let’s pose the following question: we mentioned in a footnote 
above that one could use momentum eigenfunctions instead of position eigenfunctions
as a basis for $L^2(\mathbb{R})$; what does the corresponding change of basis look like?


Example 3.17. The Momentum Representation 


As is well known from quantum mechanics, the eigenfunctions of the momentum
operator $\hat{p} = -i\frac{d}{dx}$ are the wavefunctions $\{e^{ipx}\}_{p \in \mathbb{R}}$, and these wavefunctions form
a basis for $L^2(\mathbb{R})$. In fact, the expansion of an arbitrary function $\psi \in L^2(\mathbb{R})$ in this
basis is just the Fourier expansion of $\psi$, written

$$\psi(x) = \frac{1}{2\pi}\int_{-\infty}^{\infty} dp\, \phi(p)\, e^{ipx}, \tag{3.55}$$

where the component function $\phi(p)$ is known as the Fourier transform$^{18}$ of $\psi$. One
could in fact work exclusively with $\phi(p)$ instead of $\psi(x)$, and recast the operators $\hat{x}$
and $\hat{p}$ in terms of their action on $\phi(p)$ (see Exercise 3.19 below); such an approach


18The Fourier transform of a function $\psi(x)$ is often alternatively defined as

$$\phi(p) \equiv \int_{-\infty}^{\infty} dx\, \psi(x)\, e^{-ipx},$$

which in Dirac notation would be written $\langle p|\psi\rangle$. These two equivalent definitions of $\phi(p)$ are
totally analogous to the two expressions (3.52) and (3.54) for $\psi(x)$, and are again just the two
interpretations of components discussed below (3.43).
is known as the momentum representation. Now, what does it look like when we
switch from the position representation to the momentum representation, i.e. when
we change bases from $\{\delta(x - x_0)\}_{x_0 \in \mathbb{R}}$ to $\{e^{ipx}\}_{p \in \mathbb{R}}$? Since the basis vectors are
indexed by real numbers $p$ and $x_0$ as opposed to integers $i$ and $j$, our change of basis
will not be given by a matrix with components $A^{i'}_j$ but rather a function $A(x_0, p)$.
Using the fact that both bases are orthonormal and assuming that (3.13) extends to
infinite dimensions, $A(x_0, p)$ is given by the inner product of $\delta(x - x_0)$ and $e^{ipx}$. In
Dirac notation this would be written as $\langle x_0|p\rangle$, and we have

$$\langle x_0|p\rangle = \int_{-\infty}^{\infty} dx\, \delta(x - x_0)\, e^{ipx} = e^{ipx_0}.$$

This may be a familiar equation, but can now be interpreted as a formula for the
change of basis function $A(x_0, p)$. □
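
To make the change of basis concrete, one can approximate the integrals in (3.55) and its footnote by sums over a grid. The sketch below is only a numerical illustration under assumptions of our own choosing: a Gaussian test function, a finite grid on $[-10, 10)$ used for both $x$ and $p$, and plain Riemann sums. It computes the momentum-space components $\phi(p)$ from $\psi(x)$ and then rebuilds $\psi(x)$ from them.

```python
import cmath
import math

# Riemann-sum approximation of the integrals on a uniform grid
N, L = 400, 10.0
dx = 2 * L / N
grid = [-L + k * dx for k in range(N)]            # used for both x and p

def psi(x):                                        # sample wavefunction (a Gaussian)
    return math.exp(-x * x / 2)

# phi(p) = integral dx psi(x) e^{-ipx}, i.e. the components <p|psi>
phi = [sum(psi(x) * cmath.exp(-1j * p * x) for x in grid) * dx for p in grid]

# psi(x) = (1/2pi) integral dp phi(p) e^{ipx}, the expansion (3.55)
def psi_reconstructed(x):
    return sum(f * cmath.exp(1j * p * x) for f, p in zip(phi, grid)) * dx / (2 * math.pi)

# Largest discrepancy at a few sample points
error = max(abs(psi_reconstructed(x) - psi(x)) for x in (0.0, 0.7, -1.5))
```

For this particular $\psi$ the exact transform is $\phi(p) = \sqrt{2\pi}\, e^{-p^2/2}$, which gives an independent check on the computed `phi`.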


Exercise 3.19. Use (3.55) to show that in the momentum representation, $\hat{p}\,\phi(p) = p\,\phi(p)$
and $\hat{x}\,\phi(p) = i\frac{d\phi}{dp}$.


The next issue to address is that of linear operators: having constructed a new
Hilbert space$^{19}$ $\mathcal{H}_1 \otimes \mathcal{H}_2$ out of two Hilbert spaces $\mathcal{H}_1$ and $\mathcal{H}_2$, can we construct
linear operators on $\mathcal{H}_1 \otimes \mathcal{H}_2$ out of the linear operators on $\mathcal{H}_1$ and $\mathcal{H}_2$? Well, given
linear operators $A_i$ on $\mathcal{H}_i$, $i = 1, 2$, we can define a linear operator $A_1 \otimes A_2$ on
$\mathcal{H}_1 \otimes \mathcal{H}_2$ by

$$(A_1 \otimes A_2)(v \otimes w) \equiv (A_1 v) \otimes (A_2 w). \tag{3.56}$$

You can check that with this definition, $(A \otimes B)(C \otimes D) = AC \otimes BD$. In most
quantum-mechanical applications either $A_1$ or $A_2$ is the identity, i.e. one considers
operators of the form $A_1 \otimes I$ or $I \otimes A_2$. These are often abbreviated as $A_1$ and $A_2$
even though they’re acting on $\mathcal{H}_1 \otimes \mathcal{H}_2$. We should also mention here that the inner
product $(\cdot|\cdot)_\otimes$ on $\mathcal{H}_1 \otimes \mathcal{H}_2$ is just the product of the inner products $(\cdot|\cdot)_i$ on the
$\mathcal{H}_i$, that is

$$(v_1 \otimes v_2|w_1 \otimes w_2)_\otimes \equiv (v_1|w_1)_1 \cdot (v_2|w_2)_2.$$
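
In finite dimensions these statements are easy to test numerically, since the matrix of $A \otimes B$ in the product basis is the Kronecker product of the matrices of $A$ and $B$. The following sketch uses NumPy's `np.kron` with randomly generated matrices and vectors (illustrative data only, with $\mathcal{H}_1 = \mathbb{C}^2$ and $\mathcal{H}_2 = \mathbb{C}^3$ chosen arbitrarily) to check (3.56), the composition rule, and the factorization of the inner product.

```python
import numpy as np

rng = np.random.default_rng(0)

def rand_c(*shape):
    """Random complex array (illustrative data only)."""
    return rng.normal(size=shape) + 1j * rng.normal(size=shape)

A, C = rand_c(2, 2), rand_c(2, 2)        # operators on H1 = C^2
B, D = rand_c(3, 3), rand_c(3, 3)        # operators on H2 = C^3
v1, w1 = rand_c(2), rand_c(2)            # vectors in H1
v2, w2 = rand_c(3), rand_c(3)            # vectors in H2

# (3.56): (A (x) B)(v (x) w) = (A v) (x) (B w)
acts_factorwise = np.allclose(np.kron(A, B) @ np.kron(v1, v2),
                              np.kron(A @ v1, B @ v2))

# (A (x) B)(C (x) D) = AC (x) BD
composes = np.allclose(np.kron(A, B) @ np.kron(C, D), np.kron(A @ C, B @ D))

# The inner product on H1 (x) H2 is the product of the factor inner products
# (np.vdot conjugates its first argument)
inner_factorizes = np.isclose(np.vdot(np.kron(v1, v2), np.kron(w1, w2)),
                              np.vdot(v1, w1) * np.vdot(v2, w2))
```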


The last subject we should touch upon is that of vector operators, which are 
defined to be sets of operators that transform as three-dimensional vectors under the 
adjoint action of the total angular momentum operators J;. That is, a vector operator 
is a set of operators $\{B_i\}_{i=1-3}$ (often written collectively as $\mathbf{B}$) that satisfies


19You may have noticed that we defined tensor products only for finite-dimensional spaces. The 
definition can be extended to cover infinite-dimensional Hilbert spaces, but the extra technicalities 
needed do not add any insight to what we’re trying to do here, so we omit them. The theory of 
infinite-dimensional Hilbert spaces falls under the rubric of functional analysis, and details can be 
found, for example, in Reed and Simon [15]. 
$$\mathrm{ad}_{J_i}(B_j) \equiv [J_i, B_j] = i\sum_{k=1}^{3} \epsilon_{ijk} B_k, \tag{3.57}$$

where $\epsilon_{ijk}$ is the familiar Levi-Civita symbol. The three-dimensional position
operator $\hat{\mathbf{r}} = \{\hat{x}, \hat{y}, \hat{z}\}$, momentum operator $\hat{\mathbf{p}} = \{\hat{p}_x, \hat{p}_y, \hat{p}_z\}$, and orbital angular
momentum operator $\mathbf{L} = \{L_x, L_y, L_z\}$ are all vector operators, as you can check.


Exercise 3.20. For spinless particles, $\mathbf{J} = \mathbf{L} = \hat{\mathbf{r}} \times \hat{\mathbf{p}}$. Expressions for the components may
be obtained by expanding the cross product or referencing Example 2.16 and Exercise 2.11.
Use these expressions and the canonical commutation relations $[x_i, p_j] = i\delta_{ij}$ to show that
$\hat{\mathbf{r}}$, $\hat{\mathbf{p}}$, and $\mathbf{L}$ are all vector operators.


Now we are finally ready to consider some examples, in which we’ll take as an 
axiom that adding degrees of freedom is implemented by taking tensor products 
of the corresponding Hilbert spaces. You will see that this process reproduces 
familiar results. 


Example 3.18. Addition of translational degrees of freedom 


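Consider a spinless particle constrained to move in one dimension; the quantum-mechanical
Hilbert space for this system is $L^2(\mathbb{R})$ with basis $\{|x\rangle\}_{x \in \mathbb{R}}$. If we
consider a second dimension, call it the $y$ dimension, then this degree of freedom
has its own Hilbert space $L^2(\mathbb{R})$ with basis $\{|y\rangle\}_{y \in \mathbb{R}}$. If we allow the particle both
degrees of freedom, then the Hilbert space for the system is $L^2(\mathbb{R}) \otimes L^2(\mathbb{R})$, with
basis $\{|x\rangle \otimes |y\rangle\}_{x, y \in \mathbb{R}}$. An arbitrary ket $|\psi\rangle \in L^2(\mathbb{R}) \otimes L^2(\mathbb{R})$ has expansion

$$|\psi\rangle = \int_{-\infty}^{\infty}\!\int_{-\infty}^{\infty} dx\, dy\, \psi(x, y)\, |x\rangle \otimes |y\rangle$$

with expansion coefficients $\psi(x, y)$. If we iterate this logic, we get in three
dimensions

$$|\psi\rangle = \int_{-\infty}^{\infty}\!\int_{-\infty}^{\infty}\!\int_{-\infty}^{\infty} dx\, dy\, dz\, \psi(x, y, z)\, |x\rangle \otimes |y\rangle \otimes |z\rangle.$$

If we rewrite $\psi(x, y, z)$ as $\psi(\mathbf{r})$ and $|x\rangle \otimes |y\rangle \otimes |z\rangle$ as $|\mathbf{r}\rangle$ where $\mathbf{r} = (x, y, z)$, then
we have

$$|\psi\rangle = \int d^3r\, \psi(\mathbf{r})\, |\mathbf{r}\rangle$$

which is the familiar expansion of a ket in terms of three-dimensional position
eigenkets. Such a ket is an element of $L^2(\mathbb{R}) \otimes L^2(\mathbb{R}) \otimes L^2(\mathbb{R})$, which is also
denoted as $L^2(\mathbb{R}^3)$.$^{20}$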


20 $L^2(\mathbb{R}^3)$ is actually defined to be the set of all square-integrable functions on $\mathbb{R}^3$, i.e. functions $f$
satisfying

Example 3.19. Two-particle systems


Now consider two spinless particles in three-dimensional space, possibly interacting
through some sort of potential. The two-body problem with a $1/r$ potential is a
classic example of this. The Hilbert space for such a system is then
$L^2(\mathbb{R}^3) \otimes L^2(\mathbb{R}^3)$, with basis $\{|\mathbf{r}_1\rangle \otimes |\mathbf{r}_2\rangle\}_{\mathbf{r}_i \in \mathbb{R}^3}$. In many textbooks the tensor product symbol
is omitted and such basis vectors are written as $|\mathbf{r}_1\rangle |\mathbf{r}_2\rangle$, or even $|\mathbf{r}_1, \mathbf{r}_2\rangle$. A ket $|\psi\rangle$
in this Hilbert space then has expansion

$$|\psi\rangle = \int d^3r_1 \int d^3r_2\, \psi(\mathbf{r}_1, \mathbf{r}_2)\, |\mathbf{r}_1, \mathbf{r}_2\rangle$$

which is the familiar expansion of a ket in a two-particle Hilbert space. One can
interpret $\psi(\mathbf{r}_1, \mathbf{r}_2)$ as the probability amplitude of finding particle 1 in position $\mathbf{r}_1$
and particle 2 in position $\mathbf{r}_2$ simultaneously.


Example 3.20. Addition of orbital and spin angular momentum 


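Now consider a spin $s$ particle in three dimensions. As remarked in Example 2.2,
the ket space corresponding to the spin degree of freedom is $\mathbb{C}^{2s+1}$, and one usually
takes a basis $\{|m\rangle\}_{-s \le m \le s}$ of $S_z$ eigenvectors with eigenvalue $m$. The total Hilbert
space for this system is $L^2(\mathbb{R}^3) \otimes \mathbb{C}^{2s+1}$, and we can take as a basis $\{|\mathbf{r}\rangle \otimes |m\rangle\}$
where $\mathbf{r} \in \mathbb{R}^3$ and $-s \le m \le s$. Again, the basis vectors are often written as $|\mathbf{r}\rangle |m\rangle$
or even $|\mathbf{r}, m\rangle$. An arbitrary ket $|\psi\rangle$ then has expansion

$$|\psi\rangle = \sum_{m=-s}^{s} \int d^3r\, \psi_m(\mathbf{r})\, |\mathbf{r}, m\rangle,$$

where $\psi_m(\mathbf{r})$ is the probability amplitude of finding the particle at position $\mathbf{r}$ and with $m$ units
of spin angular momentum in the $z$-direction. These wavefunctions are sometimes
written in column vector form

$$\begin{pmatrix} \psi_s \\ \psi_{s-1} \\ \vdots \\ \psi_{-s+1} \\ \psi_{-s} \end{pmatrix}.$$

The total angular momentum operator $\mathbf{J}$ is given by $\mathbf{L} \otimes I + I \otimes \mathbf{S}$, where $\mathbf{L}$ is the
orbital angular momentum operator. One might wonder why $\mathbf{J}$ isn’t given by $\mathbf{L} \otimes \mathbf{S}$;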


$$\int_{-\infty}^{\infty}\!\int_{-\infty}^{\infty}\!\int_{-\infty}^{\infty} dx\, dy\, dz\, |f|^2 < \infty.$$

Not too surprisingly, this space turns out to be identical to $L^2(\mathbb{R}) \otimes L^2(\mathbb{R}) \otimes L^2(\mathbb{R})$.

there is a good answer to this question, but it requires delving into the (fascinating)
subject of Lie groups and Lie algebras, which we postpone until Part II. In the
meantime, you can get a partial answer by checking (Exercise 3.21 below) that
the operators $L_i \otimes S_i$ don’t satisfy the angular momentum commutation relations
whereas the $L_i \otimes I + I \otimes S_i$ do.
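
Before doing the algebra by hand, it can be reassuring to see this claim hold numerically. The sketch below is a finite-dimensional stand-in and not the full story: it replaces the orbital operators by the $3 \times 3$ matrices of an $l = 1$ multiplet (an assumption made so that everything is a finite matrix), takes spin 1/2 for $\mathbf{S}$, and tests the commutation relations $[J_i, J_j] = i\sum_k \epsilon_{ijk} J_k$ for both candidate definitions of $\mathbf{J}$. The helper names `spin_matrices` and `obeys_angular_momentum_algebra` are hypothetical.

```python
import numpy as np

def spin_matrices(s):
    """Return [Jx, Jy, Jz] for spin s in the basis |s>, |s-1>, ..., |-s> (hbar = 1)."""
    m = s - np.arange(int(round(2 * s)) + 1)
    # Raising operator: J+ |m> = sqrt(s(s+1) - m(m+1)) |m+1>
    Jp = np.diag(np.sqrt(s * (s + 1) - m[1:] * (m[1:] + 1)), k=1).astype(complex)
    Jm = Jp.conj().T
    return [(Jp + Jm) / 2, (Jp - Jm) / 2j, np.diag(m).astype(complex)]

def comm(X, Y):
    return X @ Y - Y @ X

# Levi-Civita symbol with indices 0, 1, 2
eps = np.zeros((3, 3, 3))
eps[0, 1, 2] = eps[1, 2, 0] = eps[2, 0, 1] = 1
eps[0, 2, 1] = eps[2, 1, 0] = eps[1, 0, 2] = -1

def obeys_angular_momentum_algebra(J):
    """Check [J_i, J_j] = i sum_k eps_ijk J_k for all i, j."""
    return all(np.allclose(comm(J[i], J[j]),
                           1j * sum(eps[i, j, k] * J[k] for k in range(3)))
               for i in range(3) for j in range(3))

L = spin_matrices(1)        # stand-in for orbital angular momentum (l = 1)
S = spin_matrices(0.5)      # spin 1/2
I3, I2 = np.eye(3), np.eye(2)

total = [np.kron(L[i], I2) + np.kron(I3, S[i]) for i in range(3)]
product = [np.kron(L[i], S[i]) for i in range(3)]

total_ok = obeys_angular_momentum_algebra(total)      # True
product_ok = obeys_angular_momentum_algebra(product)  # False
```

A numerical check like this is of course no substitute for the exercise, but it does tell you which answer to expect.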


Exercise 3.21. Check that 


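$$[L_i \otimes I + I \otimes S_i,\; L_j \otimes I + I \otimes S_j] = i\sum_{k=1}^{3} \epsilon_{ijk}\,(L_k \otimes I + I \otimes S_k).$$

Also show that

$$[L_i \otimes S_i,\; L_j \otimes S_j] \neq i\sum_{k=1}^{3} \epsilon_{ijk}\, L_k \otimes S_k.$$

Be sure to use the bilinearity of the tensor product carefully.
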
Example 3.21. Addition of spin angular momentum 


Next consider two particles of spin $s_1$ and $s_2$ respectively, fixed in space so
that they have no translational degrees of freedom. The Hilbert space for this
system is $\mathbb{C}^{2s_1+1} \otimes \mathbb{C}^{2s_2+1}$, with basis $\{|m_1\rangle \otimes |m_2\rangle\}$ where $-s_i \le m_i \le s_i$,
$i = 1, 2$. Again, such tensor product kets are usually abbreviated as $|m_1\rangle |m_2\rangle$
or $|m_1, m_2\rangle$. There are several important linear operators on $\mathbb{C}^{2s_1+1} \otimes \mathbb{C}^{2s_2+1}$:

$\mathbf{S}_1 \otimes I$    Vector spin operator on first particle
$I \otimes \mathbf{S}_2$    Vector spin operator on second particle
$\mathbf{S} \equiv \mathbf{S}_1 \otimes I + I \otimes \mathbf{S}_2$    Total vector spin operator
$\mathbf{S}^2 \equiv \sum_i S_i S_i$    Total spin squared operator

(Why aren’t $\mathbf{S}_1^2$ and $\mathbf{S}_2^2$ in our list above?) The vectors $|m_1, m_2\rangle$ are clearly
eigenvectors of $S_{1z}$ and $S_{2z}$ and hence $S_z$ [we abuse notation as mentioned below
(3.57)] but, as you will show in Exercise 3.22, they are not necessarily eigenvectors
of $\mathbf{S}^2$. However, since the $S_i$ obey the angular momentum commutation relations (as
you can check), the general theory of angular momentum tells us that we can find a
basis for $\mathbb{C}^{2s_1+1} \otimes \mathbb{C}^{2s_2+1}$ consisting of eigenvectors of $S_z$ and $\mathbf{S}^2$. Furthermore, it
can be shown that the $\mathbf{S}^2$ eigenvalues that occur are $s(s + 1)$ where

$$s = |s_1 - s_2|,\ |s_1 - s_2| + 1,\ \ldots,\ s_1 + s_2 \tag{3.58}$$

and for a given $s$ the possible $S_z$ eigenvalues are $m$ where $-s \le m \le s$ as usual
(see Example 5.23 for further discussion of this). We will write these basis kets
as $\{|s, m\rangle\}$ where the above restrictions on $s$ and $m$ are understood, and where
we physically interpret $\{|s, m\rangle\}$ as a state with total angular momentum equal to
$\sqrt{s(s + 1)}$ and with $m$ units of angular momentum pointing along the $z$-axis. We
then have two natural and useful bases for $\mathbb{C}^{2s_1+1} \otimes \mathbb{C}^{2s_2+1}$:

$$\mathcal{B} = \{|m_1, m_2\rangle\}, \quad -s_1 \le m_1 \le s_1,\ -s_2 \le m_2 \le s_2$$
$$\mathcal{B}' = \{|s, m\rangle\}, \quad |s_1 - s_2| \le s \le s_1 + s_2,\ -s \le m \le s.$$


What does the transformation between these two bases look like? Well, by their
definition, the $A^{i'}_j$ relating the two bases are given by $e^{i'}(e_j)$; using $s, m$ collectively
in lieu of the primed index and $m_1, m_2$ collectively in lieu of the unprimed index,
we have, in Dirac notation,

$$A^{s,m}{}_{m_1,m_2} = \langle m_1, m_2|s, m\rangle. \tag{3.59}$$


These numbers, the notation for which varies widely throughout the literature, are 
known as Clebsch-Gordan Coefficients, and methods for computing them can be 
found in any standard quantum mechanics textbook (e.g., Sakurai [17]). 

Let us illustrate the foregoing with an example. Take two spin 1 particles, so that
$s_1 = s_2 = 1$. The Hilbert space for the first particle is $\mathbb{C}^3$, with $S_{1z}$ eigenvector basis
$\{|-1\rangle, |0\rangle, |1\rangle\}$, and so the two-particle system has nine-dimensional Hilbert space
$\mathbb{C}^3 \otimes \mathbb{C}^3$ with corresponding basis

$$\mathcal{B} = \{|i\rangle |j\rangle \mid i, j = -1, 0, 1\} = \{|1\rangle |1\rangle,\ |1\rangle |0\rangle,\ |1\rangle |-1\rangle,\ |0\rangle |1\rangle,\ \text{etc.}\}.$$

There should also be another basis consisting of $S_z$ and $\mathbf{S}^2$ eigenvectors, however.
From (3.58) we know that the possible $s$ values are $s = 0, 1, 2$, and it is a standard
exercise in angular momentum theory to show that the nine (normalized) $S_z$ and $\mathbf{S}^2$
eigenvectors are


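$$\begin{aligned}
|2, 2\rangle &= |1\rangle |1\rangle \\
|2, 1\rangle &= \tfrac{1}{\sqrt{2}}\left(|1\rangle |0\rangle + |0\rangle |1\rangle\right) \\
|2, 0\rangle &= \tfrac{1}{\sqrt{6}}\left(|1\rangle |-1\rangle + 2\,|0\rangle |0\rangle + |-1\rangle |1\rangle\right) \\
|2, -1\rangle &= \tfrac{1}{\sqrt{2}}\left(|-1\rangle |0\rangle + |0\rangle |-1\rangle\right) \\
|2, -2\rangle &= |-1\rangle |-1\rangle \\
|1, 1\rangle &= \tfrac{1}{\sqrt{2}}\left(|1\rangle |0\rangle - |0\rangle |1\rangle\right) \\
|1, 0\rangle &= \tfrac{1}{\sqrt{2}}\left(|1\rangle |-1\rangle - |-1\rangle |1\rangle\right) \\
|1, -1\rangle &= \tfrac{1}{\sqrt{2}}\left(|0\rangle |-1\rangle - |-1\rangle |0\rangle\right) \\
|0, 0\rangle &= \tfrac{1}{\sqrt{3}}\left(|0\rangle |0\rangle - |1\rangle |-1\rangle - |-1\rangle |1\rangle\right)
\end{aligned} \tag{3.60}$$

These vectors can be found using standard techniques from the theory of “addition
of angular momentum”; for details, see Gasiorowicz [7] or Sakurai [17]. The
coefficients appearing on the right-hand side of the above equations are precisely
the Clebsch-Gordan coefficients (3.59), as a moment’s thought should show.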
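
If you would rather not take the eigenvector claim on faith, it is quick to check with matrices. The sketch below writes down the standard spin 1 matrices (with $\hbar = 1$, in the basis $|1\rangle, |0\rangle, |-1\rangle$), forms the total spin operators with Kronecker products, and verifies that three of the $m = 0$ combinations, with the usual normalization factors $1/\sqrt{6}$, $1/\sqrt{2}$ and $1/\sqrt{3}$, are normalized eigenvectors of $\mathbf{S}^2$ and $S_z$ with eigenvalues $s(s+1)$ and $m$. The helper `pair` and the dictionary layout are our own illustrative choices.

```python
import numpy as np

# Spin-1 matrices in the basis |1>, |0>, |-1> (hbar = 1)
r = 1 / np.sqrt(2)
Sx = r * np.array([[0, 1, 0], [1, 0, 1], [0, 1, 0]], dtype=complex)
Sy = r * np.array([[0, -1j, 0], [1j, 0, -1j], [0, 1j, 0]])
Sz = np.diag([1, 0, -1]).astype(complex)
I = np.eye(3)

# Total spin components S_i = S_1i (x) I + I (x) S_2i, and S^2 = sum_i S_i S_i
S = [np.kron(X, I) + np.kron(I, X) for X in (Sx, Sy, Sz)]
S2 = sum(X @ X for X in S)

ket = {1: np.array([1, 0, 0]), 0: np.array([0, 1, 0]), -1: np.array([0, 0, 1])}

def pair(a, b):
    """The product ket |a>|b>."""
    return np.kron(ket[a], ket[b])

# Three m = 0 states, keyed by (s, m)
states = {
    (2, 0): (pair(1, -1) + 2 * pair(0, 0) + pair(-1, 1)) / np.sqrt(6),
    (1, 0): (pair(1, -1) - pair(-1, 1)) / np.sqrt(2),
    (0, 0): (pair(0, 0) - pair(1, -1) - pair(-1, 1)) / np.sqrt(3),
}

# For each state: S^2 eigenvalue s(s+1), S_z eigenvalue m, unit norm
checks = {sm: bool(np.allclose(S2 @ v, sm[0] * (sm[0] + 1) * v)
                   and np.allclose(S[2] @ v, sm[1] * v)
                   and np.isclose(np.vdot(v, v), 1))
          for sm, v in states.items()}
```

The same loop extends to the other six states by adding them to `states`.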


Exercise 3.22. Show that 


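$$\mathbf{S}^2 = \mathbf{S}_1^2 \otimes I + I \otimes \mathbf{S}_2^2 + 2\sum_i S_{1i} \otimes S_{2i}.$$

The right-hand side of the above equation is usually abbreviated as $\mathbf{S}_1^2 + \mathbf{S}_2^2 + 2\,\mathbf{S}_1 \cdot \mathbf{S}_2$. Use
this to show that $|m_1, m_2\rangle$ is not generally an eigenvector of $\mathbf{S}^2$.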


Example 3.22. Entanglement 


Consider two Hilbert spaces $\mathcal{H}_1$ and $\mathcal{H}_2$ and their tensor product $\mathcal{H}_1 \otimes \mathcal{H}_2$. Only
some of the vectors in $\mathcal{H}_1 \otimes \mathcal{H}_2$ can be written as $\psi \otimes \phi$; such vectors are referred
to as separable states or product states. All other vectors must be written as linear
combinations of the form $\sum_i \psi_i \otimes \phi_i$, and these vectors are said to be entangled,
since in this case the measurement of the degrees of freedom represented by $\mathcal{H}_1$
will influence the measurement of the degrees of freedom represented by $\mathcal{H}_2$. The
classic example of an entangled state comes from the previous example of two fixed
particles with spin; taking $s_1 = s_2 = 1/2$ and writing the standard basis for $\mathbb{C}^2$ as
$\{|+\rangle, |-\rangle\}$, we consider the particular state

$$|+\rangle |-\rangle - |-\rangle |+\rangle. \tag{3.61}$$


If an observer measures the first particle to be spin up, then a measurement of the 
second particle’s spin is guaranteed to be spin-down, and vice-versa, so measuring 
one part of the system affects what one will measure for the other part. This is the 
sense in which the system is entangled. For a product state $\psi \otimes \phi$, there is no such
entanglement: a particular measurement of the first particle cannot affect what one
measures for the second, since the second particle’s state will be $\phi$ no matter what.
You will check below that (3.61) is not a product state. 
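
For two-dimensional factors there is a handy computational test for product states, sketched below. If we arrange the expansion coefficients of a vector in $\mathbb{C}^2 \otimes \mathbb{C}^2$ into a $2 \times 2$ matrix $c_{ab}$, then a product state has $c_{ab} = \psi_a \phi_b$, so its coefficient matrix has vanishing determinant; conversely a nonzero $2 \times 2$ matrix with zero determinant has rank one and so factors in this way. The code takes (3.61) to be $|+\rangle|-\rangle - |-\rangle|+\rangle$ with basis order $(|+\rangle, |-\rangle)$; the function name and the comparison state are hypothetical.

```python
def is_product_state(c, tol=1e-12):
    """c[a][b] is the coefficient of |a>|b> for a nonzero state in C^2 (x) C^2.

    A product state psi (x) phi has c[a][b] = psi[a] * phi[b], a rank-one
    matrix, which for a nonzero 2x2 matrix is equivalent to det(c) = 0.
    """
    det = c[0][0] * c[1][1] - c[0][1] * c[1][0]
    return abs(det) < tol

# Basis order (|+>, |->). The state |+>|-> - |->|+>
singlet_like = [[0, 1], [-1, 0]]

# A product state for comparison: (|+> + |->) (x) (|+> - 2|->)
psi, phi = [1, 1], [1, -2]
product = [[psi[a] * phi[b] for b in range(2)] for a in range(2)]

entangled = not is_product_state(singlet_like)   # True
separable = is_product_state(product)            # True
```

This determinant criterion is special to $2 \times 2$ coefficient matrices; for larger factors one would look at the rank of the coefficient matrix instead.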


Exercise 3.23. Prove that (3.61) cannot be written as $\psi \otimes \phi$ for any $\psi, \phi \in \mathbb{C}^2$. Do this by
expanding $\psi$ and $\phi$ in the given basis and showing that no choice of expansion coefficients
for $\psi$ and $\phi$ will yield (3.61).


3.8 Symmetric Tensors 


Given a vector space $V$ there are certain subspaces of $\mathcal{T}^r_0(V)$ and $\mathcal{T}^0_r(V)$ which
are of particular interest: the symmetric and antisymmetric tensors. We’ll discuss
symmetric tensors in this section and antisymmetric tensors in the next. A symmetric
$(r, 0)$ tensor is an $(r, 0)$ tensor whose value is unaffected by the interchange (or
transposition) of any two of its arguments, that is

$$T(v_1, \ldots, v_i, \ldots, v_j, \ldots, v_r) = T(v_1, \ldots, v_j, \ldots, v_i, \ldots, v_r)$$

for any $i$ and $j$. Symmetric $(0, r)$ tensors are defined similarly. You can easily check
that the symmetric $(r, 0)$ and $(0, r)$ tensors each form vector spaces, denoted $S^r(V^*)$
and $S^r(V)$ respectively. For $T \in S^r(V^*)$, the symmetry condition implies that the
components $T_{i_1 \ldots i_r}$ are invariant under the transposition of any two indices, hence
invariant under any rearrangement of the indices (since any rearrangement can be
obtained via successive transpositions). Similar remarks apply, of course, to $S^r(V)$.
Notice that for rank 2 tensors, the symmetry condition implies $T_{ij} = T_{ji}$ so that
$[T]_{\mathcal{B}}$ for any $\mathcal{B}$ is a symmetric matrix. Also note that it doesn’t mean anything to
say that a linear operator is symmetric, since a linear operator is a (1, 1) tensor and 
there is no way of transposing the arguments. One might find that the matrix of a 
linear operator is symmetric in a certain basis, but this won’t necessarily be true in 
other bases. If we have a metric to raise and lower indices, then we can, of course, 
speak of symmetry by turning our linear operator into a (2, 0) or (0, 2) tensor. 


Example 3.23. $S^2(\mathbb{R}^{2*})$


Consider the set $\{e^1 \otimes e^1,\ e^2 \otimes e^2,\ e^1 \otimes e^2 + e^2 \otimes e^1\} \subset S^2(\mathbb{R}^{2*})$ where $\{e^i\}_{i=1,2}$ is
the standard dual basis. You can check that this set is linearly independent, and that
any symmetric tensor can be written as

$$T = T_{11}\, e^1 \otimes e^1 + T_{22}\, e^2 \otimes e^2 + T_{12}\,(e^1 \otimes e^2 + e^2 \otimes e^1) \tag{3.62}$$

so this set is a basis for $S^2(\mathbb{R}^{2*})$, which is thus three-dimensional. In particular, the
Euclidean metric $g$ on $\mathbb{R}^2$ can be written as

$$g = e^1 \otimes e^1 + e^2 \otimes e^2$$

since $g_{11} = g_{22} = 1$ and $g_{12} = g_{21} = 0$. Note that $g$ would not take this simple
form in a non-orthonormal basis. □


Exercise 3.24. Let $V = \mathbb{R}^n$ with the standard basis $\mathcal{B}$. Convince yourself that
$$[e^i \otimes e^j + e^j \otimes e^i]_{\mathcal{B}} = S_{ij},$$
where $S_{ij}$ is the symmetric matrix defined in Example 2.9.


There are many symmetric tensors in physics, almost all of them of rank 2. Many 
of them we’ve met already: the Euclidean metric on $\mathbb{R}^3$, the Minkowski metric on
$\mathbb{R}^4$, the moment of inertia tensor, and the Maxwell stress tensor. You should refer to
the examples and check that these are all symmetric tensors. We have also met one 
class of higher rank symmetric tensors: the multipole moments. 


Example 3.24. Multipole moments and harmonic polynomials 


Recall from Example 3.4 that the scalar potential $\Phi(\mathbf{r})$ of a charge distribution $\rho(\mathbf{r}')$
localized around the origin in $\mathbb{R}^3$ can be expanded in a Taylor series in $1/r$ as

$$\Phi(\mathbf{r}) = \frac{1}{4\pi}\left[\frac{Q_0}{r} + \frac{Q_1(\mathbf{r})}{r^3} + \frac{1}{2!}\frac{Q_2(\mathbf{r}, \mathbf{r})}{r^5} + \frac{1}{3!}\frac{Q_3(\mathbf{r}, \mathbf{r}, \mathbf{r})}{r^7} + \cdots\right] \tag{3.63}$$

where the $Q_l$ are the symmetric rank $l$ multipole moment tensors. Each symmetric
tensor $Q_l$ can be interpreted as a degree $l$ polynomial $f_l$, just by evaluating on $l$
copies of $\mathbf{r} = (x^1, x^2, x^3)$ as indicated:

$$f_l(\mathbf{r}) \equiv Q_l(\mathbf{r}, \ldots, \mathbf{r}) = Q_{i_1 \cdots i_l}\, x^{i_1} \cdots x^{i_l}, \tag{3.64}$$

where the indices $i_j$ are the usual component indices, not exponents. Note that the
expression $x^{i_1} \cdots x^{i_l}$ in the right-hand side is invariant under any rearrangement
of the indices $i_j$. This is because we fed in $l$ copies of the same vector $\mathbf{r}$ into
$Q_l$. This fits in nicely with the symmetry of $Q_{i_1 \cdots i_l}$. In fact, the above equation
gives a one-to-one correspondence between $l$th rank symmetric tensors and degree
$l$ polynomials; we won’t prove this correspondence here, but it shouldn’t be too
hard to see that (3.64) turns any symmetric tensor into a polynomial, and that,
conversely, any fixed degree polynomial can be written in the form of the right-hand
side of (3.64) with $Q_{i_1 \cdots i_l}$ symmetric. This (roughly) explains why the multipole
moments are symmetric tensors: the multipole moments are really just fixed degree
polynomials, which in turn correspond to symmetric tensors.

What about the tracelessness of the $Q_l$, i.e. the fact that $\sum_k Q_{i_1 \cdots k \cdots k \cdots i_l} = 0$?
Well, $\Phi(\mathbf{r})$ obeys the Laplace equation $\Delta\Phi(\mathbf{r}) = 0$, which means that every term in
the expansion (3.63) must obey it as well. Each such term is of the form

$$\frac{f_l(\mathbf{r})}{r^{2l+1}}.$$

If we write the polynomial $f_l(\mathbf{r})$ as $r^l\, Y(\theta, \phi)$ then a quick computation shows that
$Y(\theta, \phi)$ must be a spherical harmonic of degree $l$, and hence $f_l$ must be a harmonic
polynomial! Expanding $f_l(\mathbf{r})$ in the form (3.64) and applying the Laplacian then
shows that if $f_l$ is harmonic, then $Q_l$ must be traceless. □
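
The link between harmonic polynomials and traceless tensors is easy to see numerically in the $l = 2$ case, where $f_2(\mathbf{r}) = Q_{ij} x^i x^j$ and a direct computation gives $\Delta f_2 = 2\sum_k Q_{kk}$. The following sketch estimates the Laplacian by central differences for two made-up symmetric tensors, one traceless and one equal to the Euclidean metric. Everything here (function names, the sample tensors, the evaluation point, the step size) is an illustrative assumption.

```python
def quadratic_form(Q):
    """The degree-2 polynomial f(r) = Q_ij x^i x^j, i.e. (3.64) for l = 2."""
    return lambda r: sum(Q[i][j] * r[i] * r[j] for i in range(3) for j in range(3))

def laplacian(f, r, h=1e-3):
    """Central-difference Laplacian (exact for quadratics, up to rounding)."""
    total = 0.0
    for i in range(3):
        plus = [r[k] + (h if k == i else 0.0) for k in range(3)]
        minus = [r[k] - (h if k == i else 0.0) for k in range(3)]
        total += (f(plus) - 2 * f(r) + f(minus)) / h ** 2
    return total

# Hypothetical symmetric tensors: one traceless, one not
Q_traceless = [[1.0, 0.5, 0.0], [0.5, -3.0, 2.0], [0.0, 2.0, 2.0]]
Q_metric = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]]   # f = x^2 + y^2 + z^2

point = [0.3, -1.2, 0.7]
lap_traceless = laplacian(quadratic_form(Q_traceless), point)   # approximately 0
lap_metric = laplacian(quadratic_form(Q_metric), point)         # approximately 2 * trace = 6
```

So the polynomial of the traceless tensor is harmonic, while $x^2 + y^2 + z^2$, whose tensor has trace 3, is not.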


Exercise 3.25. What is the polynomial associated with the Euclidean metric tensor
$g = \sum_{i=1}^{3} e^i \otimes e^i$? What is the symmetric tensor in $S^3(\mathbb{R}^3)$ associated with the polynomial
$x^2 y$?


Exercise 3.26. Substitute $f_l(\mathbf{r}) = r^l\, Y(\theta, \phi)$ into the equation

$$\Delta\left(\frac{f_l(\mathbf{r})}{r^{2l+1}}\right) = 0$$

and show that $Y(\theta, \phi)$ must be a spherical harmonic of degree $l$, and thus that $f_l$ is
harmonic. Then use (3.64) to show that if $f_l$ is a harmonic polynomial, then the associated
symmetric tensor $Q_l$ must be traceless. If you have trouble showing that $Q_l$ is traceless for
arbitrary $l$, try starting with the $l = 2$ (quadrupole) and $l = 3$ (octopole) cases.
3.9 Antisymmetric Tensors


Now we turn to antisymmetric tensors. An antisymmetric (or alternating) (r,0) 
tensor is one whose value changes sign under transposition of any two of its 
arguments, i.e. 


$$T(v_1, \ldots, v_i, \ldots, v_j, \ldots, v_r) = -T(v_1, \ldots, v_j, \ldots, v_i, \ldots, v_r). \tag{3.65}$$

Again, antisymmetric $(0, r)$ tensors are defined similarly and both sets form vector
spaces, denoted $\Lambda^r V^*$ and $\Lambda^r V$ (for $r = 1$ we define $\Lambda^1 V^* = V^*$ and $\Lambda^1 V = V$).


The following properties of antisymmetric tensors follow directly from (3.65); the
first one is immediate, and the second two you will prove in Exercise 3.27 below:


1. $T(v_1, \ldots, v_r) = 0$ if $v_i = v_j$ for any $i \neq j$
$\Longrightarrow$ 2. $T(v_1, \ldots, v_r) = 0$ if $\{v_1, \ldots, v_r\}$ is linearly dependent
$\Longrightarrow$ 3. If $\dim V = n$, then the only tensor in $\Lambda^r V^*$ and $\Lambda^r V$ for $r > n$ is the 0 tensor.


An important operation on antisymmetric tensors is the wedge product: Given 
$f, g \in V^*$ we define the wedge product of $f$ and $g$, denoted $f \wedge g$, to be the
antisymmetric (2, 0) tensor defined by

$$f \wedge g \equiv f \otimes g - g \otimes f. \tag{3.66}$$

Note that $f \wedge g = -g \wedge f$, and that $f \wedge f = 0$. Expanding (3.66) in terms of the
$e^i$ gives

$$f \wedge g = f_i g_j\,(e^i \otimes e^j - e^j \otimes e^i) = f_i g_j\, e^i \wedge e^j \tag{3.67}$$

so that $\{e^i \wedge e^j\}_{i<j}$ spans all wedge products of dual vectors (note the “$i < j$”
stipulation, since $e^i \wedge e^j$ and $e^j \wedge e^i$ are not linearly independent). In fact, you will
check in Exercise 3.28 that $\{e^i \wedge e^j\}_{i<j}$ is linearly independent and spans $\Lambda^2 V^*$,
hence is a basis for $\Lambda^2 V^*$. The wedge product can be extended to $r$-fold products
of dual vectors as follows: given $r$ dual vectors $f_1, \ldots, f_r$, we define their wedge
product $f_1 \wedge \cdots \wedge f_r$ to be the sum of all tensor products of the form $f_{i_1} \otimes \cdots \otimes f_{i_r}$,
where each term gets a $-$ or a $+$ sign depending on whether an odd or an even
number$^{21}$ of transpositions of the factors are necessary to obtain it from $f_1 \otimes \cdots \otimes f_r$;
if the number is odd the term is assigned $-1$, if even a $+1$. Thus,


21The number of transpositions required to get a given rearrangement is not unique, of course,
but hopefully you can convince yourself that it’s always odd or always even. A rearrangement 
which always decomposes into an odd number of transpositions is an odd rearrangement, and 




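$$f_1 \wedge f_2 = f_1 \otimes f_2 - f_2 \otimes f_1 \tag{3.68a}$$
$$f_1 \wedge f_2 \wedge f_3 = f_1 \otimes f_2 \otimes f_3 + f_2 \otimes f_3 \otimes f_1 + f_3 \otimes f_1 \otimes f_2 - f_3 \otimes f_2 \otimes f_1 - f_2 \otimes f_1 \otimes f_3 - f_1 \otimes f_3 \otimes f_2 \tag{3.68b}$$

and so on. You should convince yourself that $\{e^{i_1} \wedge \cdots \wedge e^{i_r}\}_{i_1 < \cdots < i_r}$ is a basis
for $\Lambda^r V^*$ (see Example 3.25 below for examples of this). Note that this entire
construction can be carried out for vectors as well as dual vectors. Also note that
all the comments about symmetry above Example 3.23 apply here as well.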
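
The signed sum over rearrangements is mechanical enough to hand to a computer. The sketch below is one straightforward way to do it, not an optimized one: it stores the components of an $(r, 0)$ tensor in a dictionary keyed by index tuples (indices start at 0 rather than 1), determines the sign of a rearrangement by counting inversions, and builds the components of $f_1 \wedge \cdots \wedge f_r$ from the components of the $f_i$. The function names and sample dual vectors are hypothetical.

```python
from itertools import permutations, product

def sign(perm):
    """+1 for an even rearrangement of (0, ..., r-1), -1 for an odd one."""
    inversions = sum(1 for a in range(len(perm)) for b in range(a + 1, len(perm))
                     if perm[a] > perm[b])
    return -1 if inversions % 2 else 1

def wedge(*duals):
    """Components of f_1 ^ ... ^ f_r, as a dict keyed by index tuples.

    Each dual vector is given by its list of components in the basis {e^i}.
    """
    r, n = len(duals), len(duals[0])
    comps = {}
    for idx in product(range(n), repeat=r):
        total = 0
        for perm in permutations(range(r)):
            term = sign(perm)
            for slot, which in enumerate(perm):
                term *= duals[which][idx[slot]]
            total += term
        comps[idx] = total
    return comps

e1, e2, e3 = [1, 0, 0], [0, 1, 0], [0, 0, 1]
f, g = [1, 2, 3], [0, -1, 4]                     # hypothetical components

fg = wedge(f, g)                                  # components f_i g_j - g_i f_j
eps = wedge(e1, e2, e3)                           # components of e^1 ^ e^2 ^ e^3
dependent = wedge(f, g, [2 * a - b for a, b in zip(f, g)])   # third factor is 2f - g
```

Here `fg` reproduces (3.68a) component by component, `eps` takes the values $\pm 1$ on rearrangements of the three indices and 0 otherwise, and `dependent` vanishes identically because its three factors are linearly dependent.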


Exercise 3.27. Let $T \in \Lambda^r V^*$. Show that if $\{v_1, \ldots, v_r\}$ is a linearly dependent set then
$T(v_1, \ldots, v_r) = 0$. Use the same logic to show that if $\{f_1, \ldots, f_r\} \subset V^*$ is linearly
dependent, then $f_1 \wedge \cdots \wedge f_r = 0$. If $\dim V = n$, show that any set of more than $n$ vectors
must be linearly dependent, so that $\Lambda^r V = \Lambda^r V^* = 0$ for $r > n$.


Exercise 3.28. Prove that $\{e^i \wedge e^j\}_{i<j}$ is linearly independent by evaluating an arbitrary
linear combination on an arbitrary tensor product $e_k \otimes e_l$. Also prove that $\{e^i \wedge e^j\}_{i<j}$
spans $\Lambda^2 V^*$, and hence is a basis for it. You can do this with an argument analogous to that
used in and below (3.39).


Example 3.25. Antisymmetric tensors on $\mathbb{R}^3$


The algebra of antisymmetric tensors can be a bit intimidating at first, so it may
help to warm up with a basic example. By referring to the discussion above, you
should convince yourself that the rank $r$ antisymmetric tensors on $\mathbb{R}^3$ have bases as
follows:

$\Lambda^1 \mathbb{R}^{3*} = \mathbb{R}^{3*}$ has basis $\{e^1, e^2, e^3\}$
$\Lambda^2 \mathbb{R}^{3*}$ has basis $\{e^1 \wedge e^2,\ e^2 \wedge e^3,\ e^1 \wedge e^3\}$
$\Lambda^3 \mathbb{R}^{3*}$ has basis $\{e^1 \wedge e^2 \wedge e^3\}$.


See (3.68) for the expansions of the above wedge products in terms of tensor 
products. 


Exercise 3.29. Expand the (2,0) electromagnetic field tensor of (3.49) in the basis $\{e^i \wedge e^j\}$
where $i < j$ and $i, j = 1, 2, 3, 4$.


Exercise 3.30. Let $\dim V = n$. Show that the dimension of $\Lambda^r V^*$ and $\Lambda^r V$ is
$$\binom{n}{r} = \frac{n!}{(n - r)!\, r!}.$$


Example 3.26. Identical particles 


In quantum mechanics we often consider systems which contain identical particles, 
i.e. particles of the same mass, charge, and spin. For instance, we might consider n 
non-interacting hydrogen atoms moving in a potential well, or two electrons of a
helium atom orbiting around their nucleus. In such cases we would assume that the


even rearrangements are defined similarly. We’ll discuss this further in Chap. 4, specifically in 
Example 4.25. 
total Hilbert space $\mathcal{H}_{\text{tot}}$ would be just the $n$-fold tensor product of the single particle
Hilbert space $\mathcal{H}$. This is problematic, however. Consider two identical particles in a
box, one with state $|\psi\rangle$ and another with state $|\phi\rangle$. Then the above logic implies that
the composite state of the two particles is given by $|\psi\rangle \otimes |\phi\rangle \in \mathcal{H}_{\text{tot}} = \mathcal{H} \otimes \mathcal{H}$. The
trouble here is that this state is physically indistinguishable from $|\phi\rangle \otimes |\psi\rangle$, since we
have no way of labeling the particles so that we know which one is the “first” and
which the “second.” This makes $\mathcal{H} \otimes \mathcal{H}$ mathematically redundant as a description
of the composite system.

To formalize the issue we can introduce the linear “permutation operator” $P$ on
$\mathcal{H}_{\text{tot}}$, which just switches the factors in the tensor product, i.e.

$$P(|\psi\rangle \otimes |\phi\rangle) = |\phi\rangle \otimes |\psi\rangle. \tag{3.69}$$


Since such a permutation is neither physically executable nor observable, it seems
reasonable to require that any vector $v \in \mathcal{H}_{\text{tot}}$ representing a physical state should
be invariant under $P$, at least up to a phase, so that $Pv$ and $v$ represent the same
quantum state. In other words, a physical state $v$ should be an eigenvector of $P$! But
by (3.69) it’s clear that $P^2 = I$, which when applied to the eigenvector $v$ tells you
that $Pv = \pm v$. This means that for our two particles in a box, our options are

$$|\psi\rangle \otimes |\phi\rangle + |\phi\rangle \otimes |\psi\rangle \in S^2(\mathcal{H})$$
$$|\psi\rangle \otimes |\phi\rangle - |\phi\rangle \otimes |\psi\rangle \in \Lambda^2(\mathcal{H}),$$

where the $P$ eigenvalues are $+1$ and $-1$, respectively. But, which of these states
should we take? The amazing empirical fact is that nature takes advantage of both
states, for different particles. For certain particles (known as bosons) only the
state in $S^2(\mathcal{H})$ would be observed, while for other particles (known as fermions)
only the state in $\Lambda^2(\mathcal{H})$ would be observed. For $n$-particle systems, bosons would
similarly only occupy states in $S^n(\mathcal{H})$ while fermions would only live in $\Lambda^n \mathcal{H}$.
This restriction of the total Hilbert space to either $S^n(\mathcal{H})$ or $\Lambda^n \mathcal{H}$ is known as
the symmetrization postulate.$^{22}$ It is an empirical fact that all known particles are
fermions or bosons, in accordance with the symmetrization postulate.
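
For a finite-dimensional single-particle space the permutation operator is just a matrix, and its eigenspaces can be counted directly. The sketch below builds $P$ on $\mathcal{H} \otimes \mathcal{H}$ for $\mathcal{H} = \mathbb{C}^3$ (the dimension 3 is an arbitrary illustrative choice), checks that $P^2 = I$, and counts the $+1$ and $-1$ eigenvalues, which give the dimensions of the symmetric and antisymmetric subspaces.

```python
import numpy as np

d = 3                                    # dimension of the single-particle space H (illustrative)
I = np.eye(d)

# P swaps the factors: P (e_a (x) e_b) = e_b (x) e_a, as in (3.69)
P = np.zeros((d * d, d * d))
for a in range(d):
    for b in range(d):
        P += np.outer(np.kron(I[b], I[a]), np.kron(I[a], I[b]))

squares_to_identity = np.allclose(P @ P, np.eye(d * d))

# P is a real symmetric matrix, so eigvalsh applies; its eigenvalues are +1 or -1
eigenvalues = np.round(np.linalg.eigvalsh(P)).astype(int)
dim_sym = int(np.sum(eigenvalues == 1))       # dim S^2(H)       = d(d+1)/2
dim_antisym = int(np.sum(eigenvalues == -1))  # dim Lambda^2(H)  = d(d-1)/2
```

With $d = 3$ this gives a six-dimensional symmetric subspace and a three-dimensional antisymmetric one, which together account for all nine dimensions of $\mathcal{H} \otimes \mathcal{H}$.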

All this has far-reaching consequences. For instance, if we have two fermions, 
we cannot measure the same values for a complete set of quantum numbers for both 
particles, since then the state would have to include a term of the form $|\psi\rangle |\psi\rangle$ and
thus couldn’t belong to $\Lambda^2 \mathcal{H}$. This fact that two fermions can’t be in the same state is


22In relativistic quantum field theory, as opposed to non-relativistic quantum mechanics, this fact is
no longer an additional postulate but rather an internally deducible fact, known as the spin-statistics 
theorem. The spin-statistics theorem furthermore states that bosons have integer spin and fermions 
have half-integral spin. See Zee [24] for a discussion and further references. It is also possible to 
deduce the symmetrization postulate from our assumption that $n$-particle states are invariant under
permutation operators such as P from (3.69); proving this requires group theory, however, and so 
is postponed to Sect. 5.3. 
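known as the Pauli Exclusion Principle. As another example, consider two identical
spin 1/2 fermions fixed in space, so that $\mathcal{H}_{\text{tot}} = \Lambda^2 \mathbb{C}^2$. $\Lambda^2 \mathbb{C}^2$ is one-dimensional
with basis vector

$$|0, 0\rangle = \frac{1}{\sqrt{2}}\left(\left|\tfrac{1}{2}\right\rangle \left|-\tfrac{1}{2}\right\rangle - \left|-\tfrac{1}{2}\right\rangle \left|\tfrac{1}{2}\right\rangle\right),$$

where we have used the notation of Example 3.21. If we measure $\mathbf{S}^2$ or $S_z$ for this
system we will get 0. This is in marked contrast to the case of two distinguishable
spin 1/2 fermions; in this case, the Hilbert space is $\mathbb{C}^2 \otimes \mathbb{C}^2$ and we have additional
possible state kets

$$\begin{aligned}
|1, 1\rangle &= \left|\tfrac{1}{2}\right\rangle \left|\tfrac{1}{2}\right\rangle \\
|1, 0\rangle &= \frac{1}{\sqrt{2}}\left(\left|\tfrac{1}{2}\right\rangle \left|-\tfrac{1}{2}\right\rangle + \left|-\tfrac{1}{2}\right\rangle \left|\tfrac{1}{2}\right\rangle\right) \\
|1, -1\rangle &= \left|-\tfrac{1}{2}\right\rangle \left|-\tfrac{1}{2}\right\rangle
\end{aligned}$$

which yield a nonzero value for $\mathbf{S}^2$ and, except for $|1, 0\rangle$, for $S_z$. □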

The next three examples are a little more mathematical than physical but they are 
necessary for the discussion of pseudovectors in the next section. Hopefully you’ll 
also find them of interest in their own right. 


Example 3.27. The Levi-Civita tensor 


Consider $\mathbb{R}^n$ with the standard inner product. Let $\{e_i\}_{i=1,\ldots,n}$ be an orthonormal basis for $\mathbb{R}^n$ and consider the tensor

$$\epsilon \equiv e^1 \wedge \cdots \wedge e^n \in \Lambda^n\mathbb{R}^{n*}.$$

You can easily check that

$$\epsilon_{i_1 \ldots i_n} = \begin{cases} 0 & \text{if } \{i_1,\ldots,i_n\} \text{ contains a repeated index} \\ -1 & \text{if } \{i_1,\ldots,i_n\} \text{ is an odd rearrangement of } \{1,\ldots,n\} \\ +1 & \text{if } \{i_1,\ldots,i_n\} \text{ is an even rearrangement of } \{1,\ldots,n\}. \end{cases}$$

For $n = 3$, we saw in Example 1.1 (and you can also check in Exercise 3.31) that $\epsilon_{ijk}$ has the same values as the Levi-Civita symbol, and so $\epsilon$ here is an $n$-dimensional generalization of the three-dimensional Levi-Civita tensor we introduced in Examples 1.1 and 3.2. As in those examples, $\epsilon$ should be thought of as eating $n$ vectors and spitting out the $n$-dimensional volume spanned by those vectors. This can be seen explicitly for $n = 2$ also. Considering two vectors $\mathbf{u}$ and $\mathbf{v}$ in the $x$-$y$ plane, we have
$$\begin{aligned} \epsilon(\mathbf{u}, \mathbf{v}) &= \epsilon_{ij}\, u^i v^j \\ &= u^x v^y - u^y v^x \\ &= (\mathbf{u} \times \mathbf{v})^z \end{aligned}$$

and we know that this last expression can be interpreted as the (signed) area of the parallelogram spanned by $\mathbf{u}$ and $\mathbf{v}$; see Fig. 3.6.

Fig. 3.6 $\epsilon(\mathbf{u}, \mathbf{v})$ is just the (signed) area of the parallelogram formed by $\mathbf{u}$ and $\mathbf{v}$
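The component formula for $\epsilon$ can be tried out numerically. The following Python sketch is only an illustration under stated assumptions: the function names `levi_civita` and `epsilon` are hypothetical, indices run over 1, ..., n as in the text, and the parity of a rearrangement is found by counting inversions.

```python
from itertools import product

def levi_civita(indices):
    """Return epsilon_{i1...in} for a tuple of indices drawn from 1..n."""
    n = len(indices)
    if sorted(indices) != list(range(1, n + 1)):
        return 0  # a repeated index gives zero
    # The parity of the number of inversions is the parity of the rearrangement.
    inversions = sum(
        1
        for a in range(n)
        for b in range(a + 1, n)
        if indices[a] > indices[b]
    )
    return -1 if inversions % 2 else 1

def epsilon(*vectors):
    """Evaluate epsilon on n vectors of R^n via its components."""
    n = len(vectors)
    total = 0
    for idx in product(range(1, n + 1), repeat=n):
        sign = levi_civita(idx)
        if sign == 0:
            continue
        term = sign
        for vec, i in zip(vectors, idx):
            term *= vec[i - 1]
        total += term
    return total

# Two vectors in the x-y plane: epsilon(u, v) = u^x v^y - u^y v^x.
u = (2, 0)
v = (1, 3)
signed_area = epsilon(u, v)
```

Swapping the two arguments flips the sign of the result, which is the antisymmetry of $\epsilon$ showing up as the orientation of the parallelogram.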

Finally, note that $\Lambda^n\mathbb{R}^{n*}$ is one-dimensional, and that $\epsilon$ is the basis for it described under (3.68b).

You may object that our construction of $\epsilon$ seems to depend on a choice of metric and orthonormal basis. The former is true: $\epsilon$ does depend on the metric, and we make no apologies for that. As to whether it depends on a particular choice of orthonormal basis, we must do a little bit of investigating; this will require a brief detour into the subject of determinants.


Exercise 3.31. You may have seen the values of $\epsilon_{ijk}$ defined in terms of cyclic and anti-cyclic permutations.²³ The point of this exercise is to make the connection between that definition and ours, and to see to what extent that definition extends to higher dimensions.

(a) Check that the $\epsilon$ tensor on $\mathbb{R}^3$ satisfies

$$\epsilon_{ijk} = \begin{cases} +1 & \text{if } \{i,j,k\} \text{ is a cyclic permutation of } \{1,2,3\} \\ -1 & \text{if } \{i,j,k\} \text{ is an anti-cyclic permutation of } \{1,2,3\} \\ 0 & \text{otherwise.} \end{cases}$$

Thus for three indices the cyclic permutations are the even rearrangements and the anti-cyclic permutations are the odd ones.

(b) Consider $\epsilon$ on $\mathbb{R}^4$ with components $\epsilon_{ijkl}$. Are all rearrangements of $\{1, 2, 3, 4\}$ necessarily cyclic or anti-cyclic?

(c) Is it true that $\epsilon_{ijkl} = 1$ if $\{i, j, k, l\}$ is a cyclic permutation of $\{1, 2, 3, 4\}$?

23 A cyclic permutation of $\{1,\ldots,n\}$ is any rearrangement of $\{1,\ldots,n\}$ obtained by successively moving numbers from the beginning of the sequence to the end. That is, $\{2,\ldots,n,1\}$, $\{3,\ldots,n,1,2\}$, and so on are the cyclic permutations of $\{1,\ldots,n\}$. Anti-cyclic permutations are cyclic permutations of $\{n, n-1, \ldots, 1\}$.


Example 3.28. The determinant


You have doubtless encountered determinants before, and have probably seen them defined iteratively; that is, the determinant of a $2 \times 2$ square matrix $A$, denoted $|A|$ (or $\det A$), is defined to be

$$|A| = A_{11}A_{22} - A_{12}A_{21} \tag{3.70}$$

and then the determinant of a $3 \times 3$ matrix $B$ is defined in terms of this, i.e.

$$|B| = B_{11}\begin{vmatrix} B_{22} & B_{23} \\ B_{32} & B_{33} \end{vmatrix} - B_{12}\begin{vmatrix} B_{21} & B_{23} \\ B_{31} & B_{33} \end{vmatrix} + B_{13}\begin{vmatrix} B_{21} & B_{22} \\ B_{31} & B_{32} \end{vmatrix}. \tag{3.71}$$


This expression is known as the cofactor expansion of the determinant, and is not 
unique; one can expand about any row (or column), not necessarily (B11, B12, B13). 


In our treatment of the determinant we will take a somewhat more sophisticated approach.²⁴ Take an $n \times n$ matrix $A$ and consider its $n$ columns as $n$ column vectors in $\mathbb{R}^n$, i.e.

$$A = \begin{pmatrix} | & | & & | \\ A_1 & A_2 & \cdots & A_n \\ | & | & & | \end{pmatrix}.$$

Thus, the first column vector $A_1$ has $i$th component $A_{i1}$, and so on. Then, constructing the $\epsilon$ tensor using the standard basis and inner product on $\mathbb{R}^n$, we define the determinant of $A$, denoted $|A|$ or $\det A$, to be

$$|A| \equiv \epsilon(A_1, \ldots, A_n) \tag{3.72}$$

or in components

$$|A| = \sum_{i_1,\ldots,i_n} \epsilon_{i_1 \ldots i_n} A_{i_1 1} \cdots A_{i_n n}. \tag{3.73}$$


You should check explicitly that this definition reproduces (3.70) and (3.71) for 
n = 2,3. You can also check in the Problems that many of the familiar properties of 
determinants (sign change under interchange of columns, invariance under addition 
of rows, factoring of scalars) follow quite naturally from the definition and the 
multilinearity and antisymmetry of $\epsilon$.
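To make the component definition (3.73) concrete, here is a minimal Python sketch that evaluates the sum directly. The names `parity` and `det` are hypothetical, indices are 0-based, and only the non-vanishing terms (the rearrangements of the indices) are visited. It is meant as a check on small matrices, not as an efficient algorithm.

```python
from itertools import permutations

def parity(perm):
    """+1 for an even rearrangement of (0, ..., n-1), -1 for an odd one."""
    n = len(perm)
    inversions = sum(
        1 for a in range(n) for b in range(a + 1, n) if perm[a] > perm[b]
    )
    return -1 if inversions % 2 else 1

def det(A):
    """Sum of eps_{i1...in} A_{i1 1} ... A_{in n} over all index rearrangements."""
    n = len(A)
    total = 0
    for perm in permutations(range(n)):
        term = parity(perm)
        for col, row in enumerate(perm):
            term *= A[row][col]  # row index i_k paired with column k
        total += term
    return total

A2 = [[1, 2],
      [3, 4]]
A3 = [[1, 2, 3],
      [4, 5, 6],
      [7, 8, 10]]
# Swapping the first two columns of A3 should flip the sign of the determinant.
A3_swapped = [[row[1], row[0], row[2]] for row in A3]
```

For the $2 \times 2$ case the two surviving terms are exactly those of (3.70).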

Since the determinant is defined in terms of the epsilon tensor, which has an interpretation in terms of volume, then perhaps the determinant also has an interpretation in terms of volume. Consider our matrix $A$ as a linear operator on $\mathbb{R}^n$; then, $A$ sends the standard orthonormal basis $\{e_1, \ldots, e_n\}$ to a new, potentially non-orthonormal basis $\{Ae_1, \ldots, Ae_n\}$. If $\{e_1, \ldots, e_n\}$ spans a regular $n$-cube whose volume is $\epsilon(e_1, \ldots, e_n) = 1$, then the vectors $\{Ae_1, \ldots, Ae_n\}$ span a skewed $n$-cube with volume given by $\epsilon(Ae_1, \ldots, Ae_n)$. To evaluate this volume, recall from Box 2.4 that $Ae_1$ is nothing but $A_1$, the first column of $A$; thus, by the definition (3.72), the volume is just $\det A$! We thus conclude that:

24 For a complete treatment, however, you should consult Hoffman and Kunze [13], Chap. 5.

Fig. 3.7 The action of $A$ on the standard cube in $\mathbb{R}^3$ (the standard cube has Vol = 1, its image has Vol = $\det A$). The determinant of $A$ is just the volume of the skew cube spanned by $\{Ae_1, Ae_2, Ae_3\}$


The determinant of a matrix A is the (oriented) volume of the skew n-cube 
obtained by applying A to the standard n-cube. 


This is illustrated for n = 3 in Fig. 3.7. 

You may have noticed that this volume can be negative, which is why we called 
the determinant an oriented (or signed) volume; the interpretation of this is given in 
the next example. 


Example 3.29. Orientations and the € tensor 


Note: This material is a bit abstract and may be skipped on a first reading. 


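With the determinant in hand we may now explore to what extent the definition of $\epsilon$ depends on our choice of orthonormal basis. Consider another orthonormal basis $\{e_{i'} = A^{j}_{i'} e_j\}$. If we define an $\epsilon'$ in terms of this basis, we find

$$\begin{aligned} \epsilon' &= e^{1'} \wedge \cdots \wedge e^{n'} \\ &= A^{1'}_{i_1} \cdots A^{n'}_{i_n}\, e^{i_1} \wedge \cdots \wedge e^{i_n} \\ &= A^{1'}_{i_1} \cdots A^{n'}_{i_n}\, \epsilon^{i_1 \ldots i_n}\, e^{1} \wedge \cdots \wedge e^{n} \\ &= |A|\,\epsilon, \end{aligned} \tag{3.74}$$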
where in the third equality we used the fact that if $e^{i_1} \wedge \cdots \wedge e^{i_n}$ doesn’t vanish it can always be rearranged to give $e^1 \wedge \cdots \wedge e^n$, and any resulting sign change is accounted for by the Levi-Civita symbol. Now since both $\{e_i\}$ and $\{e_{i'}\}$ are orthonormal bases, $A$ must be an orthogonal matrix. We can then use the product rule for determinants $|AB| = |A||B|$ (see Problem 3-4 for a simple proof) and the fact that $|A^T| = |A|$ to get

$$1 = |I| = |AA^T| = |A||A^T| = |A|^2$$

which implies $|A| = \pm 1$. Thus by (3.74) $\epsilon' = \epsilon$ if the two orthonormal bases used in their construction are related by an orthogonal transformation $A$ with $|A| = 1$; such a transformation is called a rotation,²⁵ and two bases related by a rotation, or by any transformation with $|A| > 0$, are said to have the same orientation. If two bases are related by a basis transformation with $|A| < 0$, then the two bases are said to have the opposite orientation. We can then define an orientation as a maximal²⁶ set of bases all having the same orientation, and you can show (see Problem 3-6) that $\mathbb{R}^n$ has exactly two orientations. In $\mathbb{R}^3$ these two orientations are the right-handed bases and the left-handed bases, and are depicted schematically in Fig. 3.8. Thus we can say that $\epsilon$ doesn’t depend on a particular choice of orthonormal basis, but it does depend on a metric and a choice of orientation, where the orientation chosen is the one determined by the standard basis.

Fig. 3.8 The two orientations of $\mathbb{R}^3$ (panels: right-handed bases, left-handed bases). The upper-left-most basis is usually considered the standard basis


25 No doubt you are used to thinking about a rotation as a transformation that preserves distances and fixes a line in space (the axis of rotation). This definition of a rotation is particular to $\mathbb{R}^3$, since even in $\mathbb{R}^2$ a rotation can’t be considered to be “about an axis” since $\hat{z} \notin \mathbb{R}^2$. For the equivalence of our general definition and the more intuitive definition in $\mathbb{R}^3$, see Goldstein [8].


26 i.e., could not be made bigger.
The notion of orientation allows us to understand the interpretation of the determinant as an “oriented” volume: the sign of the determinant just tells us whether or not the orientation of $\{Ae_i\}$ is the same as $\{e_i\}$. Also, for orientation-changing transformations on $\mathbb{R}^3$ one can show that $A$ can be written as $A = A_0(-I)$, where $A_0$ is a rotation and $-I$ is referred to as the inversion transformation. The inversion transformation plays a key role in differentiating vectors from pseudovectors, which are the subject of the next section.


Box 3.4 Rotations vs. Translation 

In the last example we introduced rotations as orthogonal transformations with 
unit determinant. We will have much more to say about rotations in Part II of 
this book, where they will be the prototype for Lie groups and Lie algebras. For 
now, though, it may be worth contrasting rotations with translations, which are 
the other familiar symmetry of 3-D Euclidean space. Translation by a vector $\mathbf{w} \in \mathbb{R}^3$ is just a map

$$T_{\mathbf{w}} : \mathbb{R}^3 \to \mathbb{R}^3, \qquad \mathbf{v} \mapsto \mathbf{v} + \mathbf{w}.$$

The main contrast with rotations is that whereas rotations can be associated with linear operators (as per our discussion in Sect. 3.3), translations are non-linear maps; that is, they do not satisfy the linearity condition (2.16), as you can check. This means their action on $\mathbb{R}^3$ cannot be expressed in terms of matrix multiplication.
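The failure of linearity is easy to see on a concrete case. The short Python sketch below uses hypothetical helper names (`translate`, `add`) and an arbitrarily chosen translation vector; it compares $T(\mathbf{u} + \mathbf{v})$ with $T(\mathbf{u}) + T(\mathbf{v})$ and also checks whether the origin is fixed, which any linear map must do.

```python
def translate(w):
    """Return the map v -> v + w for a fixed vector w."""
    def T(v):
        return tuple(vi + wi for vi, wi in zip(v, w))
    return T

def add(u, v):
    """Componentwise sum of two vectors."""
    return tuple(ui + vi for ui, vi in zip(u, v))

T = translate((1.0, 0.0, 0.0))   # assumed example: unit shift along x
u = (1.0, 2.0, 3.0)
v = (0.0, -1.0, 4.0)

lhs = T(add(u, v))               # T(u + v)
rhs = add(T(u), T(v))            # T(u) + T(v), which picks up w twice
is_additive = lhs == rhs
fixes_origin = T((0.0, 0.0, 0.0)) == (0.0, 0.0, 0.0)
```

Both flags come out false here: the sum $T(\mathbf{u}) + T(\mathbf{v})$ contains the shift $\mathbf{w}$ twice, and the origin is moved to $\mathbf{w}$.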

We will not have much more to say about translations, though we will 
discuss the relationship between translations in position and momentum space 
in Example 4.38. 


3.10 Pseudovectors 


Note: All indices in this section refer to orthonormal bases. The calculations and 
results below do not apply to non-orthonormal bases, though they can be generalized 
to such. 

A pseudovector (or axial vector) is a tensor on $\mathbb{R}^3$ whose components transform like vectors under rotations but don’t change sign under inversion. Common examples of pseudovectors are the angular velocity vector $\boldsymbol{\omega}$, the magnetic field vector $\mathbf{B}$, as well as all cross products, such as the angular momentum vector $\mathbf{L} = \mathbf{r} \times \mathbf{p}$. It turns out that pseudovectors like these are actually elements of $\Lambda^2\mathbb{R}^3$, which are known as bivectors.

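To see the connection, consider the wedge product of two vectors $\mathbf{r}, \mathbf{p} \in \mathbb{R}^3$. This looks like

$$\begin{aligned} \mathbf{r} \wedge \mathbf{p} &= (r^1 e_1 + r^2 e_2 + r^3 e_3) \wedge (p^1 e_1 + p^2 e_2 + p^3 e_3) \\ &= (r^1 p^2 - r^2 p^1)\, e_1 \wedge e_2 + (r^3 p^1 - r^1 p^3)\, e_3 \wedge e_1 + (r^2 p^3 - r^3 p^2)\, e_2 \wedge e_3. \end{aligned} \tag{3.75}$$

This looks just like $\mathbf{r} \times \mathbf{p}$ if we make the identifications

$$\begin{aligned} e_1 \wedge e_2 &\longrightarrow e_3 \\ e_3 \wedge e_1 &\longrightarrow e_2 \\ e_2 \wedge e_3 &\longrightarrow e_1. \end{aligned} \tag{3.76}$$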


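In terms of matrices, this corresponds to the identification²⁷

$$\begin{pmatrix} 0 & -\alpha^3 & \alpha^2 \\ \alpha^3 & 0 & -\alpha^1 \\ -\alpha^2 & \alpha^1 & 0 \end{pmatrix} \longrightarrow \begin{pmatrix} \alpha^1 \\ \alpha^2 \\ \alpha^3 \end{pmatrix}. \tag{3.77}$$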


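This identification can be embodied in a one-to-one and onto map

$$J : \Lambda^2\mathbb{R}^3 \to \mathbb{R}^3$$

defined as follows. If $\alpha \in \Lambda^2\mathbb{R}^3$, then we can expand it as

$$\alpha = \alpha^{23}\, e_2 \wedge e_3 + \alpha^{31}\, e_3 \wedge e_1 + \alpha^{12}\, e_1 \wedge e_2.$$

We then define $J(\alpha)$ via its components $(J(\alpha))^i$ as

$$(J(\alpha))^i \equiv \frac{1}{2}\,\epsilon^{i}{}_{jk}\,\alpha^{jk}. \tag{3.78}$$

You will check below that this definition really does give the identifications written above. Note that $J$ is essentially just a contraction with the epsilon tensor. With this, we see that $\mathbf{r} \times \mathbf{p}$ is really just $J(\mathbf{r} \wedge \mathbf{p})$! Thus: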


Cross products are essentially just bivectors. 
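As a quick numerical illustration of this statement, the following Python sketch stores the components $(\mathbf{r} \wedge \mathbf{p})^{ij} = r^i p^j - r^j p^i$ in a 3 x 3 array, contracts with the Levi-Civita symbol as in (3.78), and compares the result with the familiar cross product. The helper names (`wedge`, `eps3`, `J`, `cross`) are hypothetical, and indices run over 0, 1, 2 rather than 1, 2, 3.

```python
def wedge(r, p):
    """Array of components (r ^ p)^{ij} = r^i p^j - r^j p^i."""
    return [[r[i] * p[j] - r[j] * p[i] for j in range(3)] for i in range(3)]

def eps3(i, j, k):
    """Levi-Civita symbol for indices in {0, 1, 2}."""
    return (i - j) * (j - k) * (k - i) // 2

def J(alpha):
    """(J(alpha))^i = (1/2) eps_{ijk} alpha^{jk}."""
    return [
        sum(eps3(i, j, k) * alpha[j][k] for j in range(3) for k in range(3)) / 2
        for i in range(3)
    ]

def cross(r, p):
    """Ordinary cross product in R^3."""
    return [
        r[1] * p[2] - r[2] * p[1],
        r[2] * p[0] - r[0] * p[2],
        r[0] * p[1] - r[1] * p[0],
    ]

r = [1, 2, 3]
p = [4, 5, 6]
from_bivector = J(wedge(r, p))
from_cross = cross(r, p)
```

The factor of 1/2 in `J` compensates for each independent component of the antisymmetric array being counted twice in the double sum.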


Exercise 3.32. Check that J, as defined by (3.78), acts on basis vectors as in (3.76). Also 
check that when written in terms of matrices, J produces the map (3.77). 


Exercise 3.33. We can now give a simple derivation of the BAC-CAB rule of vector algebra. For $\mathbf{A}, \mathbf{B}, \mathbf{C} \in \mathbb{R}^3$, note that $(\mathbf{B} \wedge \mathbf{C})(L(\mathbf{A}), \cdot)$ is also a vector in $\mathbb{R}^3$ (here $L(\mathbf{A})$ is the metric dual of $\mathbf{A}$; cf. Sect. 2.7). Evaluate this vector in two ways: using the definition (3.66) of the wedge product, and using (3.75). This should yield the BAC-CAB rule, as well as (hopefully) some intuition for it.


27 To map the components of $e_i \wedge e_j$ to a matrix you’ll need the convention discussed in Box 2.4.


Now that we know how to identify bivectors and regular vectors, we must examine what it means for bivectors to transform “like” vectors under rotations, but without a sign change under inversion. On the face of things, it seems like bivectors should transform very differently from vectors; after all, a bivector is a (0, 2) tensor, and you can show that it has matrix transformation law

$$[\alpha]_{\mathcal{B}'} = A[\alpha]_{\mathcal{B}}A^T. \tag{3.79}$$

This looks very different from the transformation law for the associated vector $J(\alpha)$, which is just [cf. (3.24a)]

$$[J(\alpha)]_{\mathcal{B}'} = A[J(\alpha)]_{\mathcal{B}}. \tag{3.80}$$

In particular, the bivector $\alpha$ transforms with two copies of $A$, and the vector $J(\alpha)$ with just one. How could these transformation laws be “the same”? Well, remember that $\alpha$ isn’t any old (0,2) tensor, but an antisymmetric one, and for these a small miracle happens. This is best appreciated by considering an example. Let $A$ be a rotation about the $z$ axis by an angle $\theta$, so that

$$A = \begin{pmatrix} \cos\theta & -\sin\theta & 0 \\ \sin\theta & \cos\theta & 0 \\ 0 & 0 & 1 \end{pmatrix}.$$


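Then you can check that, with $\alpha^i \equiv (J(\alpha))^i$,

$$\begin{aligned} {[\alpha]}_{\mathcal{B}'} &= A[\alpha]_{\mathcal{B}}A^T \\ &= \begin{pmatrix} \cos\theta & -\sin\theta & 0 \\ \sin\theta & \cos\theta & 0 \\ 0 & 0 & 1 \end{pmatrix} \begin{pmatrix} 0 & -\alpha^3 & \alpha^2 \\ \alpha^3 & 0 & -\alpha^1 \\ -\alpha^2 & \alpha^1 & 0 \end{pmatrix} \begin{pmatrix} \cos\theta & \sin\theta & 0 \\ -\sin\theta & \cos\theta & 0 \\ 0 & 0 & 1 \end{pmatrix} \\ &= \begin{pmatrix} 0 & -\alpha^3 & \alpha^2\cos\theta + \alpha^1\sin\theta \\ \alpha^3 & 0 & \alpha^2\sin\theta - \alpha^1\cos\theta \\ -\alpha^2\cos\theta - \alpha^1\sin\theta & -\alpha^2\sin\theta + \alpha^1\cos\theta & 0 \end{pmatrix} \end{aligned} \tag{3.81a}$$

$$\longrightarrow \begin{pmatrix} -\alpha^2\sin\theta + \alpha^1\cos\theta \\ \alpha^2\cos\theta + \alpha^1\sin\theta \\ \alpha^3 \end{pmatrix} \tag{3.81b}$$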


which is exactly a rotation of the components $\alpha^i$ of $J(\alpha)$! This seems to suggest that if we transform the components of $\alpha \in \Lambda^2\mathbb{R}^3$ by a rotation first and then apply $J$, or apply $J$ and then rotate the components, we get the same thing. In other words, the map $J$ commutes with rotations, and that is what it means for both bivectors and vectors to behave “the same” under rotations.
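The same check can be run numerically. The Python sketch below is an illustration only: the helper names are hypothetical, the angle and components are arbitrary sample values, and the antisymmetric matrix is laid out as in the middle factor of (3.81a). It transforms the matrix with two copies of $A$ and compares the extracted vector with $A$ applied directly to the components.

```python
import math

def matmul(X, Y):
    """Product of two 3x3 matrices stored as nested lists."""
    return [[sum(X[i][k] * Y[k][j] for k in range(3)) for j in range(3)]
            for i in range(3)]

def transpose(X):
    return [list(col) for col in zip(*X)]

def matvec(X, v):
    return [sum(X[i][k] * v[k] for k in range(3)) for i in range(3)]

def bivector_matrix(a):
    """Antisymmetric matrix built from (a1, a2, a3), laid out as in (3.81a)."""
    a1, a2, a3 = a
    return [[0.0, -a3, a2], [a3, 0.0, -a1], [-a2, a1, 0.0]]

def vector_from_matrix(M):
    """Read (a1, a2, a3) back off an antisymmetric matrix of that layout."""
    return [M[2][1], M[0][2], M[1][0]]

def rotation_z(theta):
    c, s = math.cos(theta), math.sin(theta)
    return [[c, -s, 0.0], [s, c, 0.0], [0.0, 0.0, 1.0]]

theta = 0.7                      # arbitrary sample angle
a = [1.0, 2.0, 3.0]              # arbitrary sample components
A = rotation_z(theta)

# Rotate the bivector (two copies of A), then extract the vector ...
rotate_then_J = vector_from_matrix(
    matmul(matmul(A, bivector_matrix(a)), transpose(A)))
# ... versus extract the vector first, then rotate it (one copy of A).
J_then_rotate = matvec(A, a)
```

The two results agree to rounding error, as (3.81b) says they should.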


Exercise 3.34. Derive (3.79). You may need to consult Sect. 3.2. Also, verify (3.81a) by performing the necessary matrix multiplication.


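To prove that $J$ commutes with arbitrary rotations (the example above just proved it for rotations about the $z$-axis), we need to show that

$$A^{i'}_{j}\,\alpha^{j} = \frac{1}{2}\,\epsilon^{i'}{}_{k'l'}\,A^{k'}_{m}A^{l'}_{n}\,\alpha^{mn}, \tag{3.82}$$

where $A$ is a rotation. On the left-hand side $J$ is applied first followed by a rotation, and on the right-hand side the rotation is done first, followed by $J$.

We now compute:

$$\begin{aligned} \frac{1}{2}\,\epsilon^{i'}{}_{k'l'}\,A^{k'}_{m}A^{l'}_{n}\,\alpha^{mn} &= \frac{1}{2}\,\epsilon_{p'k'l'}\,\delta^{i'p'}A^{k'}_{m}A^{l'}_{n}\,\alpha^{mn} \\ &= \frac{1}{2}\sum_{q}\epsilon_{p'k'l'}\,A^{i'}_{q}A^{p'}_{q}A^{k'}_{m}A^{l'}_{n}\,\alpha^{mn} \\ &= \frac{1}{2}\sum_{q}\epsilon_{qmn}\,|A|\,A^{i'}_{q}\,\alpha^{mn} \\ &= \frac{1}{2}\,|A|\,\epsilon^{q}{}_{mn}\,A^{i'}_{q}\,\alpha^{mn} \\ &= |A|\,A^{i'}_{q}\,\alpha^{q}, \end{aligned} \tag{3.83}$$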


where in the second equality we used a variant of (3.26) which comes from writing out $AA^T = I$ in components, in the third equality we used the easily verified fact that $\epsilon_{p'k'l'}A^{p'}_{q}A^{k'}_{m}A^{l'}_{n} = |A|\,\epsilon_{qmn}$, and in the fourth equality we raised an index to resume the use of Einstein summation convention and were able to do so because covariant and contravariant components are equal in orthonormal bases. Now, for rotations $|A| = 1$ so in this case (3.83) and (3.82) are identical and $J(\alpha)$ does transform like a vector. For inversion, however, $|A| = |-I| = -1$ so (3.83) tells us that the components of $J(\alpha)$ do not change sign under inversion, as those of an ordinary vector would. Another way to see this is to set $A = -I$ in (3.79) (check!).
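The suggested check with $A = -I$ in (3.79) can also be done by machine. This is a minimal Python sketch with hypothetical helper names and arbitrary sample components; it uses the same antisymmetric matrix layout as (3.81a).

```python
def matmul(X, Y):
    """Product of two 3x3 matrices stored as nested lists."""
    return [[sum(X[i][k] * Y[k][j] for k in range(3)) for j in range(3)]
            for i in range(3)]

def transpose(X):
    return [list(col) for col in zip(*X)]

def matvec(X, v):
    return [sum(X[i][k] * v[k] for k in range(3)) for i in range(3)]

def bivector_matrix(a):
    """Antisymmetric matrix built from (a1, a2, a3), laid out as in (3.81a)."""
    a1, a2, a3 = a
    return [[0, -a3, a2], [a3, 0, -a1], [-a2, a1, 0]]

def vector_from_matrix(M):
    return [M[2][1], M[0][2], M[1][0]]

inversion = [[-1, 0, 0], [0, -1, 0], [0, 0, -1]]   # A = -I
a = [1, 2, 3]                                       # arbitrary sample components

# Bivector law (3.79): two copies of -I, so the signs cancel.
pseudovector_after = vector_from_matrix(
    matmul(matmul(inversion, bivector_matrix(a)), transpose(inversion)))
# Ordinary vector law (3.80): one copy of -I, so the sign flips.
vector_after = matvec(inversion, a)
```

The bivector's components come back unchanged while the ordinary vector's components are negated, which is the defining behavior of a pseudovector.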

We have thus shown that pseudovectors are bivectors, since bivectors transform 
like vectors under rotation but don’t change sign under inversion. We have also 
seen that cross products are very naturally interpreted as bivectors. There are other 
pseudovectors lying around, though, that don’t naturally arise as cross products. For 
instance, what about the angular velocity vector $\boldsymbol{\omega}$?


Example 3.30. The Angular Velocity Vector 


The angular velocity vector $\boldsymbol{\omega}$ is usually introduced in the context of rigid body rotations. One usually fixes the center of mass of the body, and then the velocity $\mathbf{v}$ of a point of the rigid body is given by

$$\mathbf{v} = \boldsymbol{\omega} \times \mathbf{r}, \tag{3.84}$$

where $\mathbf{r}$ is the position vector of the point as measured from the center of mass. The derivation of this equation usually involves consideration of an angle and axis of rotation, and from these considerations one can argue that $\boldsymbol{\omega}$ is a pseudovector. Here we will take a different approach in which $\boldsymbol{\omega}$ will appear first as an antisymmetric matrix, making the bivector nature of $\boldsymbol{\omega}$ manifest.

Fig. 3.9 Our rigid body with fixed space frame $K'$ and body frame $K$ in gray

Let $K$ and $K'$ be two orthonormal bases for $\mathbb{R}^3$ as in Example 2.12, with $K$ time-dependent. One should think of $K$ as being attached to the rotating rigid body, whereas $K'$ is fixed. We’ll refer to $K$ as the body frame and $K'$ as the space frame. Both frames have their origin at the center of mass of the rigid body. This is depicted in Fig. 3.9.

Now let $\mathbf{r}$ represent a point of the rigid body; then, $[\mathbf{r}]_K$ will be its coordinates in the body frame, and $[\mathbf{r}]_{K'}$ its coordinates in the space frame. Let $A$ be the (time-dependent) orthogonal matrix of the basis transformation taking $K'$ to $K$, so that

$$[\mathbf{r}]_{K'} = A[\mathbf{r}]_K. \tag{3.85}$$

We’d now like to calculate the velocity $[\mathbf{v}]_{K'}$ of our point in the rigid body relative to the space frame, and compare it to (3.84). This is given by just differentiating $[\mathbf{r}]_{K'}$:

$$\begin{aligned} {[\mathbf{v}]}_{K'} &= \frac{d}{dt}[\mathbf{r}]_{K'} \\ &= \frac{d}{dt}\big(A[\mathbf{r}]_K\big) && \text{by (3.85)} \\ &= \frac{dA}{dt}[\mathbf{r}]_K && \text{since } [\mathbf{r}]_K \text{ is constant} \\ &= \frac{dA}{dt}A^{-1}[\mathbf{r}]_{K'}. \end{aligned} \tag{3.86}$$


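So far this doesn’t really look like (3.84). What to do? First, observe that $\frac{dA}{dt}A^{-1} = \frac{dA}{dt}A^T$ is actually an antisymmetric matrix:

$$\begin{aligned} 0 &= \frac{d}{dt}(I) \\ &= \frac{d}{dt}(AA^T) \\ &= \frac{dA}{dt}A^T + A\frac{dA^T}{dt} \\ &= \frac{dA}{dt}A^T + \left(\frac{dA}{dt}A^T\right)^T. \end{aligned} \tag{3.87}$$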

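We can then define an angular velocity bivector $\tilde{\omega}$ whose components in the space frame are given by

$$[\tilde{\omega}]_{K'} \equiv \frac{dA}{dt}A^{-1}.$$

Then we simply define the angular velocity vector $\boldsymbol{\omega}$ to be

$$\boldsymbol{\omega} \equiv J(\tilde{\omega}).$$

Note that $\boldsymbol{\omega}$ is, in general, time-dependent. It follows from this definition that

$$\frac{dA}{dt}A^{-1} = \begin{pmatrix} 0 & -\omega^{3'} & \omega^{2'} \\ \omega^{3'} & 0 & -\omega^{1'} \\ -\omega^{2'} & \omega^{1'} & 0 \end{pmatrix},$$

so that

$$\frac{dA}{dt}A^{-1}[\mathbf{r}]_{K'} = \begin{pmatrix} 0 & -\omega^{3'} & \omega^{2'} \\ \omega^{3'} & 0 & -\omega^{1'} \\ -\omega^{2'} & \omega^{1'} & 0 \end{pmatrix} \begin{pmatrix} x^{1'} \\ x^{2'} \\ x^{3'} \end{pmatrix} = [\boldsymbol{\omega} \times \mathbf{r}]_{K'}.$$

Combining this with (3.86), we then have

$$[\mathbf{v}]_{K'} = [\boldsymbol{\omega} \times \mathbf{r}]_{K'} \tag{3.88}$$

which is just (3.84) written in the space frame. Thus,


The “pseudovector” $\boldsymbol{\omega}$ in the space frame is nothing more than the vector associated with the antisymmetric matrix $\frac{dA}{dt}A^{-1}$!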


Exercise 3.35. Use (3.79) to show that the bivector $\tilde{\omega}$ in the body frame is

$$[\tilde{\omega}]_K = A^{-1}\frac{dA}{dt}.$$

Combine this with (3.86) to show that (3.84) is true in the body frame as well.


In the last example we saw that to any time-dependent rotation matrix $A$ we could associate an antisymmetric matrix $\frac{dA}{dt}A^{-1}$, which we can identify with the angular velocity vector which represents “infinitesimal” rotations. This association
between finite transformations and their infinitesimal versions, which in the case of 
rotations takes us from orthogonal matrices to antisymmetric matrices, is precisely 
the relationship between a Lie group and its Lie algebra. We turn our attention to 
these objects in the next part of this book. 
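The passage from a time-dependent rotation to an antisymmetric matrix can be illustrated numerically. The Python sketch below makes several assumptions that are not from the text: the rotation is about the $z$ axis at a constant, arbitrarily chosen rate `RATE`, the derivative is estimated by a central difference with step `h`, and all function names are hypothetical. It forms the product of $dA/dt$ with $A^T$, checks antisymmetry, and reads off the associated vector using the matrix layout of (3.81a).

```python
import math

def rotation_z(angle):
    c, s = math.cos(angle), math.sin(angle)
    return [[c, -s, 0.0], [s, c, 0.0], [0.0, 0.0, 1.0]]

def matmul(X, Y):
    return [[sum(X[i][k] * Y[k][j] for k in range(3)) for j in range(3)]
            for i in range(3)]

def transpose(X):
    return [list(col) for col in zip(*X)]

def angular_velocity_matrix(A_of_t, t, h=1e-6):
    """Central-difference estimate of (dA/dt) A^T at time t."""
    A_plus, A_minus = A_of_t(t + h), A_of_t(t - h)
    dA_dt = [[(A_plus[i][j] - A_minus[i][j]) / (2 * h) for j in range(3)]
             for i in range(3)]
    return matmul(dA_dt, transpose(A_of_t(t)))

RATE = 2.0  # assumed constant angular speed about z, in radians per unit time

def A_of_t(t):
    return rotation_z(RATE * t)

W = angular_velocity_matrix(A_of_t, t=0.3)
# Vector associated with the antisymmetric matrix W.
omega = [W[2][1], W[0][2], W[1][0]]
```

Up to finite-difference error, `W` is antisymmetric and `omega` points along $z$ with magnitude `RATE`, as expected for a steady rotation about that axis.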


Chapter 3 Problems 


Note: Problems marked with an “x” tend to be longer, and/or more difficult, and/or 
more geared towards the completion of proofs and the tying up of loose ends. 
Though these problems are still worthwhile, they can be skipped on a first reading. 


3-1. In this problem we explore the properties of $n \times n$ orthogonal matrices. This is the set of real invertible matrices $A$ satisfying $A^T = A^{-1}$, and is denoted $O(n)$.


(a) Is $O(n)$ a vector subspace of $M_n(\mathbb{R})$?

(b) Show that the product of two orthogonal matrices is again orthogonal, that the inverse of an orthogonal matrix is again orthogonal, and that the identity matrix is orthogonal. These properties show that $O(n)$ is a group, i.e. a set with an associative multiplication operation and identity element such that the set is closed under multiplication and every element has a multiplicative inverse. Groups are the subject of Chap. 4.

(c) Show that the columns of an orthogonal matrix $A$, viewed as vectors in $\mathbb{R}^n$, are mutually orthogonal under the usual inner product. Show the same for the rows. Show that for an active transformation, i.e.

$$[e_{i'}]_{\mathcal{B}} = A[e_i]_{\mathcal{B}},$$

where $\mathcal{B} = \{e_i\}_{i=1,\ldots,n}$ so that

$$[e_i]_{\mathcal{B}}^T = (0, \ldots, \underbrace{1}_{i\text{th slot}}, \ldots, 0),$$

the columns of $A$ are the $[e_{i'}]_{\mathcal{B}}$. In other words, the components of the new basis vectors in the old basis are just the columns of $A$. This also shows that for a passive transformation, where

$$[e_i]_{\mathcal{B}'} = A[e_i]_{\mathcal{B}},$$

the columns of $A$ are the components of the old basis vectors in the new basis.

(d) Show that the orthogonal matrices $A$ with $|A| = 1$, the rotations, form a subgroup unto themselves, denoted $SO(n)$. Do the matrices with $|A| = -1$ also form a subgroup?

3-2. In this problem we’ll compute the dimension of the space of $(0, r)$ symmetric tensors $S^r(V)$. This is slightly more difficult to compute than the dimension of the space of $(0, r)$ antisymmetric tensors $\Lambda^r V$, which was Exercise 3.30.


(a) Let $\dim V = n$ and $\{e_i\}_{i=1,\ldots,n}$ be a basis for $V$. Argue that $\dim S^r(V)$ is given by the number of ways you can choose $r$ (possibly repeated) vectors from the basis $\{e_i\}_{i=1,\ldots,n}$.

(b) We’ve now reduced the problem to a combinatorics problem: how many ways can you choose $r$ objects from a set of $n$ objects, where any object can be chosen more than once? The answer is

$$\dim S^r(V) = \binom{n+r-1}{n-1} = \frac{(n+r-1)!}{r!\,(n-1)!}. \tag{3.89}$$

Try to derive this on your own. If you need help, the solution to this is known as the “stars and bars” or “balls and walls” method; you can also refer to Sternberg [19], Chap. 5.
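If you would like to check (3.89) on small cases before or after deriving it, one option is a brute-force count. The Python sketch below uses hypothetical function names; it enumerates all unordered selections of $r$ items from $n$ with repetition allowed and compares the count with the closed formula. It verifies the formula but does not replace the derivation.

```python
from itertools import combinations_with_replacement
from math import comb

def count_selections(n, r):
    """Count unordered choices of r items from n, repetition allowed."""
    return sum(1 for _ in combinations_with_replacement(range(n), r))

def dim_symmetric(n, r):
    """Closed formula (n + r - 1)! / (r! (n - 1)!), as a binomial coefficient."""
    return comb(n + r - 1, r)

# Compare the two for a handful of small n and r.
agreement = all(
    count_selections(n, r) == dim_symmetric(n, r)
    for n in range(1, 6)
    for r in range(0, 5)
)
```

For example, with $n = 3$ and $r = 2$ both counts give 6, matching the six independent components of a symmetric $3 \times 3$ matrix.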
3-3. Prove the following basic properties of the determinant directly from the definition (3.72). We will restrict our discussion to operations with columns, though it can be shown that all the corresponding statements for rows are true as well.


(a) Any matrix with a column of zeros has $|A| = 0$.

(b) Multiplying a column by a scalar $c$ multiplies the whole determinant by $c$.

(c) The determinant changes sign under interchange of any two columns.

(d) Adding one column to another, i.e. sending $A_i \to A_i + A_j$ for any $i \neq j$, doesn’t change the value of the determinant.


3-4. One can extend the definition of determinants from matrices to more general linear operators as follows: We know that a linear operator $T$ on a vector space $V$ (equipped with an inner product and orthonormal basis $\{e_i\}_{i=1,\ldots,n}$) can be extended to an operator on the $p$-fold tensor product $\mathcal{T}^0_p(V)$ by

$$T(v_1 \otimes \cdots \otimes v_p) = (Tv_1) \otimes \cdots \otimes (Tv_p)$$

and thus, since $\Lambda^n V \subset \mathcal{T}^0_n(V)$, the action of $T$ extends to $\Lambda^n V$ similarly by

$$T(v_1 \wedge \cdots \wedge v_n) = (Tv_1) \wedge \cdots \wedge (Tv_n).$$

Consider then the action of $T$ on the contravariant version of $\epsilon$, the tensor $\tilde{\epsilon} \equiv e_1 \wedge \cdots \wedge e_n$. We know from Exercise 3.30 that $\Lambda^n V$ is one-dimensional, so that $T(\tilde{\epsilon}) = (Te_1) \wedge \cdots \wedge (Te_n)$ is proportional to $\tilde{\epsilon}$. We then define the determinant of $T$ to be this proportionality constant, so that

$$(Te_1) \wedge \cdots \wedge (Te_n) \equiv |T|\, e_1 \wedge \cdots \wedge e_n. \tag{3.90}$$

(a) Show by expanding the left-hand side of (3.90) in components that this more general definition reduces to the old one of (3.73) in the case of $V = \mathbb{R}^n$.

(b) Use this definition of the determinant to show that for two linear operators $B$ and $C$ on $V$,

$$|BC| = |B||C|.$$

In particular, this result holds when $B$ and $C$ are square matrices.

(c) Use (b) to show that the determinant of a matrix is invariant under similarity transformations (see Example 3.8). Conclude that we could have defined the determinant of a linear operator $T$ as the determinant of its matrix in any basis.




3-5. Let $V$ be a vector space with an inner product and orthonormal basis $\{e_i\}_{i=1,\ldots,n}$. Prove that a linear operator $T$ is invertible if and only if $|T| \neq 0$, as follows:


(a) Show that $T$ is invertible if and only if $\{T(e_i)\}_{i=1,\ldots,n}$ is a linearly independent set (see Exercise 2.9 for the “if” part of the statement).

(b) Show that $|T| \neq 0$ if and only if $\{T(e_i)\}_{i=1,\ldots,n}$ is a linearly independent set.

(c) This is not a problem, just a comment. In Example 3.28 we interpreted the determinant of a matrix $A$ as the oriented volume of the $n$-cube determined by $\{Ae_i\}$. As you just showed, if $A$ is not invertible then the $Ae_i$ are linearly dependent, hence span a space of dimension less than $n$ and thus yield an $n$-dimensional volume of 0. Thus, the geometrical picture is consistent with the results you just obtained!


3-6. Let $\mathcal{B}$ be the standard basis for $\mathbb{R}^n$, $\mathcal{O}$ the set of all bases related to $\mathcal{B}$ by a basis transformation with $|A| > 0$, and $\mathcal{O}'$ the set of all bases related to $\mathcal{B}$ by a transformation with $|A| < 0$.


(a) Using what we’ve learned in the preceding problems, show that a basis transformation matrix $A$ cannot have $|A| = 0$.

(b) $\mathcal{O}$ is by definition an orientation. Show that $\mathcal{O}'$ is also an orientation, and conclude that $\mathbb{R}^n$ has exactly two orientations. Note that both $\mathcal{O}$ and $\mathcal{O}'$ contain orthonormal and non-orthonormal bases.

(c) For what $n$ is $A = -I$ an orientation-changing transformation?
Part II
Group Theory 


Chapter 4 
Groups, Lie Groups, and Lie Algebras 


In physics we are often interested in how a particular object behaves under a 
particular set of transformations; for instance, one often reads that under rotations 
a dipole moment transforms like a vector, and a quadrupole moment like a (second
rank) tensor. Electric and magnetic fields supposedly transform like independent 
vectors under rotations, but collectively like a second rank antisymmetric tensor 
under Lorentz transformations. Similarly, in quantum mechanics one is often 
interested in the “spin” of a ket (which specifies how it transforms under rotations), 
or its behavior under the time-reversal or space inversion (parity) transformations. 
This knowledge is particularly useful as it leads to the many famous “selection 
rules” which greatly simplify evaluation of matrix elements. Transformations are 
also crucial in quantum mechanics because, as we’ll see, all physical observables 
can be considered as “infinitesimal generators” of particular transformations; for 
example, the angular momentum operators “generate” rotations (as we discussed 
briefly in Problem 2-4) and the momentum operator “generates” translations. 

Like tensors, this material is usually treated in a somewhat ad-hoc way, which 
facilitates computation but obscures the underlying mathematical structures. These 
underlying structures are known to mathematicians as group theory, Lie theory, 
and representation theory, and are known collectively to physicists as just “group 
theory”. In this second half of the book we’ll present the basic facts of this theory, 
along with many physical applications. As in Part I, the aim is to clarify and unify 
the diverse phenomena in physics that this mathematics underlies. Furthermore, the 
mathematics of group theory is in some senses a natural extension and application 
of what we learned in Part I. 

Before we discuss how particular objects transform, however, we must discuss 
the transformations themselves, in both their “finite” and “infinitesimal” form. That
discussion is the subject of the present chapter. As with Part I, we begin with a 
heuristic introduction which hopefully conveys some of the essential points, as well 
as motivates the precise discussion that follows. 
4.1 Invitation: Lie Groups and Infinitesimal Generators


In studying physics you may have encountered the terms “group” and “Lie 
group”, and if you’ve studied quantum mechanics you most likely have heard of 
“infinitesimal generators”. These terms can sound quite mysterious, and physics 
students are often left with the sense that something profound but just out of reach 
is going on underneath the routine homework problems. Our goal in this section 
is to demystify these subjects by introducing their basic ideas, along with familiar 
examples. 

We begin with groups. Though group theory is a vast, profound, and often 
abstract subject, the basic idea of a group is simple: it is just a set of transformations 
that are composable and invertible. We’ll define this more precisely in the next 
section, but for now we’ll illustrate what that means with an example. 


Example 4.1. 2-D Rotations 


Consider a position vector $\mathbf{r} = (x, y)$ in the plane, and rotate it counterclockwise by an angle $\theta$, as illustrated in Fig. 4.1. Then the rotated vector $\mathbf{r}'$ has coordinates

$$\begin{aligned} x' &= x\cos\theta - y\sin\theta \\ y' &= x\sin\theta + y\cos\theta. \end{aligned}$$

This transformation can be re-written in terms of matrix multiplication as

$$\begin{pmatrix} x' \\ y' \end{pmatrix} = \begin{pmatrix} \cos\theta & -\sin\theta \\ \sin\theta & \cos\theta \end{pmatrix} \begin{pmatrix} x \\ y \end{pmatrix}.$$

The $2 \times 2$ matrix

$$R(\theta) \equiv \begin{pmatrix} \cos\theta & -\sin\theta \\ \sin\theta & \cos\theta \end{pmatrix} \tag{4.1}$$

is what we mean by a “rotation”; it’s what a rotation is. We can then consider the set of all such 2-D rotations, which we’ll denote¹ $SO(2)$:

$$SO(2) \equiv \left\{ \begin{pmatrix} \cos\theta & -\sin\theta \\ \sin\theta & \cos\theta \end{pmatrix} \;\middle|\; \theta \in [0, 2\pi) \right\}.$$

Fig. 4.1 An arbitrary vector $\mathbf{r} = (x, y)$ in the plane, along with its rotated version $\mathbf{r}'$

It should be clear from our geometric picture of rotations in Fig. 4.1 that composing two rotations yields a third rotation; you can also check analytically that

$$R(\theta) \cdot R(\phi) = R(\theta + \phi). \tag{4.2}$$

Thus, rotations are composable. Or, put another way, $SO(2)$ is closed under matrix multiplication.

It’s hopefully also clear from our picture of rotations that

$$R(\theta)^{-1} = R(-\theta), \tag{4.3}$$

and this is also easily checked analytically. Thus, every rotation has an inverse that is also a rotation, so rotations are invertible. Thus, rotations are a set of transformations that are both composable and invertible, and that makes $SO(2)$ a group.
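The two group properties can be spot-checked numerically before doing the algebra. This Python sketch uses hypothetical helper names and arbitrary sample angles; it builds $R(\theta)$ from (4.1) and tests (4.2) and (4.3) up to floating-point rounding.

```python
import math

def R(theta):
    """The 2x2 rotation matrix of (4.1)."""
    c, s = math.cos(theta), math.sin(theta)
    return [[c, -s], [s, c]]

def matmul2(X, Y):
    """Product of two 2x2 matrices stored as nested lists."""
    return [[sum(X[i][k] * Y[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

def close(X, Y, tol=1e-12):
    """Entrywise comparison up to a small tolerance."""
    return all(abs(X[i][j] - Y[i][j]) < tol for i in range(2) for j in range(2))

theta, phi = 0.4, 1.1                     # arbitrary sample angles
identity = [[1.0, 0.0], [0.0, 1.0]]

composable = close(matmul2(R(theta), R(phi)), R(theta + phi))   # (4.2)
invertible = close(matmul2(R(theta), R(-theta)), identity)      # (4.3)
```

A numerical check at a few angles is of course not a proof; the analytic verification uses the angle-addition formulas for sine and cosine.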


Exercise 4.1. Use (4.1) to verify (4.2) and (4.3). 


Note that $R(\theta)$ from (4.1) is completely and uniquely determined by the angle $\theta$; we then say that $SO(2)$ is parameterized by $\theta$. Any such group, which can be
smoothly parameterized by one or more continuous variables, is known as a Lie 
group. Lie groups stand in contrast to discrete groups, which don’t accommodate 
such a parameterization. (We will meet examples of these in the next section.) 
Also, note that SO(2) is a group composed of matrices. Thus, SO(2) is a matrix 
Lie group. For many applications in physics this particular class of groups is the 
most important, and we will spend much time in Part II studying these groups. 

Another mysterious concept that one encounters in physics is that of an “infinites- 
imal transformation” or “infinitesimal generator”. This concept only applies to 
Lie groups, since a smooth, continuous parameterization is required to make a 
transformation “infinitesimal”. As with groups, it is possible to give infinitesimal 
generators a one-sentence description: they are just derivatives of Lie group 
elements with respect to their parameters. We again illustrate with some examples. 


Example 4.2. Infinitesimal generators of 2-D rotations 


If an infinitesimal generator is just a derivative, then let’s differentiate $R(\theta)$ from (4.1) and evaluate at $\theta = 0$:

$$X \equiv \left.\frac{dR}{d\theta}\right|_{\theta=0} = \begin{pmatrix} 0 & -1 \\ 1 & 0 \end{pmatrix}.$$

The matrix $X$ is called a “generator” because it generates rotations, in the following sense. Consider the vector $\mathbf{r}_0 = (1, 0)$ in the plane, as well as the curve

$$\mathbf{r}(\theta) = R(\theta) \cdot \mathbf{r}_0 \tag{4.4}$$

traced out by the vector as it rotates; see Fig. 4.2. Then the tangent vector to this curve at $\theta = 0$ is given by

$$\left.\frac{d\mathbf{r}}{d\theta}\right|_{\theta=0} = \left.\frac{dR}{d\theta}\right|_{\theta=0} \mathbf{r}_0 = X\mathbf{r}_0 = \begin{pmatrix} 0 & -1 \\ 1 & 0 \end{pmatrix} \begin{pmatrix} 1 \\ 0 \end{pmatrix} = \begin{pmatrix} 0 \\ 1 \end{pmatrix}.$$

We can thus interpret $X$ as the matrix which turns $\mathbf{r}_0$ into a tangent vector; it takes in a position vector and produces the direction in which the position vector will change as a rotation is applied. This is the sense in which $X$ “generates” rotations, and is also illustrated in Fig. 4.2.

Fig. 4.2 The dashed line is the parameterized curve $\mathbf{r}(\theta)$ from (4.4). The tangent vector $\frac{d\mathbf{r}}{d\theta}$ at $\theta = 0$ is just given by $X\mathbf{r}_0$

1 This notation may be familiar if you did Problem 3-1 (d). If not, it will be explained in the next section.

Example 4.3. Infinitesimal generators of 3-D rotations 


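In three dimensions we now have three different axes about which we can rotate.
The matrices corresponding to rotations about the x, y, and z axes are analogous
to (4.1) and are given, respectively, by:

\[
R_x(\theta) = \begin{pmatrix} 1 & 0 & 0 \\ 0 & \cos\theta & -\sin\theta \\ 0 & \sin\theta & \cos\theta \end{pmatrix}, \quad
R_y(\theta) = \begin{pmatrix} \cos\theta & 0 & \sin\theta \\ 0 & 1 & 0 \\ -\sin\theta & 0 & \cos\theta \end{pmatrix}, \quad
R_z(\theta) = \begin{pmatrix} \cos\theta & -\sin\theta & 0 \\ \sin\theta & \cos\theta & 0 \\ 0 & 0 & 1 \end{pmatrix}.
\]

Note that R_x(θ) leaves the x-axis invariant, as a rotation about the x-axis should.
Similarly for R_y(θ) and R_z(θ). The corresponding generators are

\[
L_x \equiv \left.\frac{dR_x}{d\theta}\right|_{\theta=0}
    = \begin{pmatrix} 0 & 0 & 0 \\ 0 & 0 & -1 \\ 0 & 1 & 0 \end{pmatrix}, \quad
L_y = \begin{pmatrix} 0 & 0 & 1 \\ 0 & 0 & 0 \\ -1 & 0 & 0 \end{pmatrix}, \quad
L_z = \begin{pmatrix} 0 & -1 & 0 \\ 1 & 0 & 0 \\ 0 & 0 & 0 \end{pmatrix}. \tag{4.5}
\]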

(You may recognize L_z from (2.23), up to a factor of i. The appearance here of L_z
and the discrepancy of a factor of i will be explained in Chap. 5.) Note that

\[ \mathrm{Span}\{L_x, L_y, L_z\} = \{\text{antisymmetric } 3 \times 3 \text{ matrices}\} = \mathfrak{so}(3) \]

where so(3) denotes the vector space of 3 × 3 antisymmetric matrices.² Note that
we have again uncovered a close connection between rotations and antisymmetric
matrices, just as we found at the very end of Part I in Example 3.30.

If you read that example, then much of the following will be familiar. It turns
out that any element of so(3), and not just L_x, L_y, or L_z, can be interpreted as a
generator of a rotation. If ω ∈ so(3), then ω is antisymmetric and can be written as

\[
\omega = \begin{pmatrix} 0 & -\omega_z & \omega_y \\ \omega_z & 0 & -\omega_x \\ -\omega_y & \omega_x & 0 \end{pmatrix}.
\]

²The notation so(3) may seem like it has a deeper meaning, and it does, but we can’t explain
that until Sect. 4.6. Until then, just consider so(3) as odd notation for the vector space of 3 × 3
antisymmetric matrices.

(The reason for the strange labeling of the components of ω will be clear in a
moment.) Now, ω generates rotations just as the matrix X from the last example
does: if we matrix multiply a position vector r = (x, y, z) by ω, we get the direction
in which r is changing as the rotation is applied. To figure out what rotation that is,
you can check that:

\[
\omega r
  = \begin{pmatrix} 0 & -\omega_z & \omega_y \\ \omega_z & 0 & -\omega_x \\ -\omega_y & \omega_x & 0 \end{pmatrix}
    \begin{pmatrix} x \\ y \\ z \end{pmatrix}
  = \begin{pmatrix} \omega_y z - \omega_z y \\ \omega_z x - \omega_x z \\ \omega_x y - \omega_y x \end{pmatrix}
  = \vec{\omega} \times r \tag{4.6}
\]

where

\[ \vec{\omega} \equiv (\omega_x, \omega_y, \omega_z) \]

is a “vector” (really a pseudovector; see Sect. 3.10) associated with the antisymmetric
matrix ω. Now, carefully distinguishing ω the matrix from ω⃗ the vector, we set
r = ω⃗ in (4.6) to get

\[ \omega \vec{\omega} = \vec{\omega} \times \vec{\omega} = 0. \]

This means that ω⃗ is unchanged by the rotation, so must lie along the axis of
rotation. Furthermore, the expression (4.6) is just the expression from classical
mechanics for the velocity of a point r on a rigid body rotating with angular velocity
vector ω⃗. We can thus conclude that


The generator ω ∈ so(3) generates a rotation about the ω⃗ axis, where the
pseudovector ω⃗ is the familiar angular velocity vector.
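
The identity (4.6) is easy to test numerically. The following sketch builds the antisymmetric
matrix from a sample vector, compares the matrix product with the cross product, and confirms
that the vector itself is sent to zero. The helper names (`hat`, `matvec`, `cross`) and the
sample numbers are our own, chosen only for illustration.

```python
def hat(w):
    """Antisymmetric matrix associated with the vector w = (wx, wy, wz)."""
    wx, wy, wz = w
    return [[0.0, -wz, wy],
            [wz, 0.0, -wx],
            [-wy, wx, 0.0]]

def matvec(m, v):
    """Product of a 3 x 3 matrix with a 3-component vector."""
    return [sum(m[i][j] * v[j] for j in range(3)) for i in range(3)]

def cross(a, b):
    """Ordinary cross product a x b."""
    return [a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0]]

w = [0.3, -1.2, 2.0]    # arbitrary sample components
r = [1.0, 4.0, -0.5]    # arbitrary sample position vector

lhs = matvec(hat(w), r)         # matrix acting on r
rhs = cross(w, r)               # cross product with r
axis_image = matvec(hat(w), w)  # matrix acting on its own vector
print(lhs, rhs, axis_image)
```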


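To conclude this example, we note that the matrices L_x, L_y, and L_z have
interesting interrelationships via the commutator. For instance:

\begin{align*}
[L_x, L_y] &\equiv L_x L_y - L_y L_x \\
&= \begin{pmatrix} 0 & 0 & 0 \\ 0 & 0 & -1 \\ 0 & 1 & 0 \end{pmatrix}
   \begin{pmatrix} 0 & 0 & 1 \\ 0 & 0 & 0 \\ -1 & 0 & 0 \end{pmatrix}
 - \begin{pmatrix} 0 & 0 & 1 \\ 0 & 0 & 0 \\ -1 & 0 & 0 \end{pmatrix}
   \begin{pmatrix} 0 & 0 & 0 \\ 0 & 0 & -1 \\ 0 & 1 & 0 \end{pmatrix} \\
&= \begin{pmatrix} 0 & 0 & 0 \\ 1 & 0 & 0 \\ 0 & 0 & 0 \end{pmatrix}
 - \begin{pmatrix} 0 & 1 & 0 \\ 0 & 0 & 0 \\ 0 & 0 & 0 \end{pmatrix} \\
&= \begin{pmatrix} 0 & -1 & 0 \\ 1 & 0 & 0 \\ 0 & 0 & 0 \end{pmatrix} \\
&= L_z. \tag{4.7}
\end{align*}

Similarly, you can check that

\[
\begin{aligned}
[L_y, L_z] &= L_x \\
[L_z, L_x] &= L_y.
\end{aligned} \tag{4.8}
\]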

Taken together, (4.7) and (4.8) are, of course, just the familiar angular momentum 
commutation relations from quantum mechanics. Note that they are derived here 
not in the abstract setting of Hermitian operators on a Hilbert space, but rather 
from simple 3 x 3 matrices that generate rotations! Furthermore, (4.7) and (4.8) 
tell us that so(3) is a vector space which is closed under commutators, meaning that
a commutator of two so(3) elements yields another so(3) element. Such a vector
space is known as a Lie algebra, and we have thus seen here that:


The vector space of infinitesimal generators of a matrix Lie group forms 
a Lie algebra. 


Matrix Lie groups and their associated Lie algebras will be the central (but not 
exclusive) focus of Part II of this book. 
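
If you would like to see the commutation relations come out of a computation before working
them by hand, here is a minimal sketch using plain nested lists. The matrices are the generators
from (4.5); the helper functions `matmul`, `commutator`, and `is_antisymmetric` are our own
names.

```python
Lx = [[0, 0, 0], [0, 0, -1], [0, 1, 0]]
Ly = [[0, 0, 1], [0, 0, 0], [-1, 0, 0]]
Lz = [[0, -1, 0], [1, 0, 0], [0, 0, 0]]

def matmul(a, b):
    """Product of two square matrices stored as nested lists."""
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def commutator(a, b):
    """[a, b] = ab - ba."""
    ab, ba = matmul(a, b), matmul(b, a)
    n = len(a)
    return [[ab[i][j] - ba[i][j] for j in range(n)] for i in range(n)]

def is_antisymmetric(m):
    """True if m equals minus its transpose."""
    n = len(m)
    return all(m[i][j] == -m[j][i] for i in range(n) for j in range(n))

print(commutator(Lx, Ly) == Lz)
print(commutator(Ly, Lz) == Lx)
print(commutator(Lz, Lx) == Ly)
```

Because the entries are integers, the comparisons are exact. The antisymmetry check illustrates
closure: each commutator is again an antisymmetric matrix.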


Exercise 4.2. Verify (4.8). 


4.2 Groups: Definition and Examples 


Now that we have a little bit of a feel for what groups are, it is time to give 
their precise definition. As with vector spaces, we will give an abstract, axiomatic 
definition, where the axioms are just meant to embody the most important properties 
of sets of transformations, including the composability and invertibility we just 
discussed. As we proceed, you may find it helpful to carry the examples of 2-D and 
3-D rotations from the previous section in the back of your mind. However, as with 
vector spaces, the utility in taking the abstract, axiomatic approach is that what we 
learn about abstract groups will then apply to objects that look nothing like rotations 
in Euclidean space, so long as those objects satisfy the axioms. After laying down 
the axioms and establishing some basic properties of groups, we’ll proceed directly 
to concrete examples. 

That said, a group is a set G together with a “multiplication” operation, denoted
·, that satisfies the following axioms:


1. (Closure) g, h ∈ G implies g · h ∈ G.
2. (Associativity) For g, h, k ∈ G, g · (h · k) = (g · h) · k.
3. (Existence of the identity) There exists an element e ∈ G such that g · e =
e · g = g ∀ g ∈ G.

4. (Existence of inverses) ∀ g ∈ G there exists an element h ∈ G such that g · h =
h · g = e.


If we think of a group as a set of transformations, as we usually do in physics,
then the multiplication operation is obviously just composition; that is, if R and S
are three-dimensional rotations, for instance, then R · S is just S followed by R.
Note that we don’t necessarily have R · S = S · R for all rotations R and S; in
cases such as this, G is said to be non-commutative (or non-abelian). If we did
have S · R = R · S for all R, S ∈ G, then we would say that G is commutative
(or abelian).

There are several important properties of groups that follow almost immediately 
from the definition. Firstly, the identity is unique, for if e and f are both elements 
satisfying axiom 3 then we have 


\[
\begin{aligned}
e &= e \cdot f && \text{since } f \text{ is an identity} \\
  &= f && \text{since } e \text{ is an identity.}
\end{aligned}
\]


Secondly, inverses are unique: Let g ∈ G and let h and k both be inverses of g.
Then

\[ g \cdot h = e \]

so multiplying both sides on the left by k gives

\[
\begin{aligned}
k \cdot (g \cdot h) &= k, \\
(k \cdot g) \cdot h &= k && \text{by associativity,} \\
e \cdot h &= k && \text{since } k \text{ is an inverse of } g, \\
h &= k.
\end{aligned}
\]

We henceforth denote the unique inverse of an element g as g⁻¹.


Thirdly, if g ∈ G and h is merely a right inverse for g, i.e.

\[ g \cdot h = e, \tag{4.9} \]

then h is also a left inverse for g and is hence the unique inverse g⁻¹. This is seen
as follows:

\[
\begin{aligned}
h \cdot g &= (g^{-1} \cdot g) \cdot (h \cdot g) \\
&= (g^{-1} \cdot (g \cdot h)) \cdot g && \text{by associativity} \\
&= (g^{-1} \cdot e) \cdot g && \text{by (4.9)} \\
&= g^{-1} \cdot g \\
&= e,
\end{aligned}
\]

so h = g⁻¹.

The last few properties concern inverses and can be verified immediately:

\[
\begin{aligned}
(g^{-1})^{-1} &= g \\
(g \cdot h)^{-1} &= h^{-1} \cdot g^{-1} \\
e^{-1} &= e.
\end{aligned}
\]


Exercise 4.3. Prove the cancellation laws for groups, i.e. that

\[
\begin{aligned}
g_1 \cdot h = g_2 \cdot h &\implies g_1 = g_2 \\
h \cdot g_1 = h \cdot g_2 &\implies g_1 = g_2.
\end{aligned}
\]


Before we get to some examples, we should note that axioms 2 and 3 in the 
definition above are usually obviously satisfied and one rarely needs to check them 
explicitly. The important thing in showing that a set is a group is verifying that 
it is closed under multiplication and contains all its inverses. Also, as a matter of 
notation, from now on we will usually omit the · when writing a product, and simply
write gh for g · h.


Example 4.4. ℝ: The real numbers as an additive group


Consider the real numbers ℝ with the group “multiplication” operation given by
regular addition, i.e.

\[ x \cdot y \equiv x + y, \qquad x, y \in \mathbb{R}. \]

It may seem counterintuitive to define “multiplication” as addition, but the definition
of a group is rather abstract so there is nothing that prevents us from doing this,
and this point of view will turn out to be useful. With addition as the product, ℝ
becomes an abelian group: The first axiom to verify is closure, and this is satisfied
since the sum of two real numbers is always a real number. The associativity axiom
is also satisfied, since it is a fundamental property of real numbers that addition is
associative. The third axiom, dictating the existence of the identity, is satisfied since
0 ∈ ℝ fits the bill. The fourth axiom, which dictates the existence of inverses, is
satisfied since for any x ∈ ℝ, −x is its (additive) inverse. Thus ℝ is a group under
addition, and is in fact an abelian group since x + y = y + x ∀ x, y ∈ ℝ.

Note that ℝ is not a group under regular multiplication, since 0 has no multiplicative
inverse. If we remove 0, though, then we do get a group, the multiplicative
group of nonzero real numbers, denoted ℝ*. We leave it to you to verify that ℝ* is a
group. You should also verify that this entire discussion goes through for ℂ as well,
so that ℂ under addition and ℂ* ≡ ℂ\{0} under multiplication are both abelian
groups.


Example 4.5. Vector spaces as additive groups 


The previous example of ℝ and ℂ as additive groups can be generalized to the
case of vector spaces, which are abelian groups under vector space addition. The
group axioms follow directly from the vector space axioms, as you should check,
with 0 as the identity. While viewing vector spaces as additive groups means we
ignore the crucial feature of scalar multiplication, we’ll see that this perspective will
occasionally prove useful.


Example 4.6. GL(V), GL(n, R), and GL(n,C): The general linear groups 


The general linear group of a vector space V, denoted GL(V), is defined to be
the subset of L(V) consisting of all invertible linear operators on V. We can easily
verify that GL(V) is a group: to verify closure, note that for any T, U ∈ GL(V),
TU is linear and (TU)⁻¹ = U⁻¹T⁻¹, so TU is invertible. To verify associativity,
note that for any T, U, V ∈ GL(V) and v ∈ V, we have

\[ (T(UV))(v) = T(U(V(v))) = ((TU)V)(v) \]

(careful unraveling the meaning of the parentheses!) so that T(UV) = (TU)V. To
verify the existence of the identity, just note that I is invertible and linear, hence
in GL(V). To verify the existence of inverses, note that for any T ∈ GL(V), T⁻¹
exists and is invertible and linear, hence is in GL(V) also. Thus GL(V) is a group.

Let V have scalar field C and dimension n. If we pick a basis for V, then
for each T ∈ GL(V) we get an invertible matrix [T] ∈ M_n(C). Just as all the
invertible T ∈ L(V) form a group, so do the corresponding invertible matrices in
M_n(C); this group is denoted as GL(n, C), and the group axioms can be readily
verified for it.³ When C = ℝ we get GL(n, ℝ), the real general linear group in n
dimensions, and when C = ℂ we get GL(n, ℂ), the complex general linear group
in n dimensions.
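
To make the closure and inverse arguments concrete, the sketch below checks
(TU)⁻¹ = U⁻¹T⁻¹ for two sample invertible 2 × 2 matrices with rational entries, so that all
comparisons are exact. The matrices T and U and the helper names are illustrative choices of
ours, not taken from the text.

```python
from fractions import Fraction

def matmul(a, b):
    """Product of two 2 x 2 matrices stored as nested lists."""
    return [[sum(a[i][k] * b[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

def inverse(m):
    """Inverse of a 2 x 2 matrix; raises ValueError if it is not invertible."""
    (a, b), (c, d) = m
    det = a * d - b * c
    if det == 0:
        raise ValueError("matrix is not invertible")
    return [[d / det, -b / det], [-c / det, a / det]]

F = Fraction
T = [[F(2), F(1)], [F(1), F(1)]]    # determinant 1
U = [[F(1), F(3)], [F(0), F(2)]]    # determinant 2
identity = [[F(1), F(0)], [F(0), F(1)]]

TU = matmul(T, U)
closure_ok = inverse(TU) == matmul(inverse(U), inverse(T))
inverse_ok = matmul(T, inverse(T)) == identity
print(closure_ok, inverse_ok)
```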


While neither GL(V), GL(n, ℝ), nor GL(n, ℂ) occur explicitly very often in
physics, they have many important subgroups, i.e. subsets which themselves are
groups. The most important of these arise when we have a vector space V equipped
with a non-degenerate Hermitian form (·|·). In this case, we can consider the set of
isometries Isom(V), consisting of those operators T which “preserve” (·|·) in the
sense that

\[ (Tv|Tw) = (v|w) \quad \forall\, v, w \in V. \tag{4.10} \]

³You may recall having met GL(n, ℝ) at the end of Sect. 2.1. There we asked why it isn’t a vector
space, and now we know: it’s more properly thought of as a group!

If (·|·) can be interpreted as giving the “length” of vectors, then an isometry T
can be thought of as an operator that preserves lengths.⁴ Note that any such T is
invertible (why?), hence Isom(V) ⊂ GL(V). Isom(V) is in fact a subgroup of
GL(V), as we’ll now verify. First off, for any T, U ∈ Isom(V),

\[
\begin{aligned}
((TU)v|(TU)w) &= (T(Uv)|T(Uw)) \\
&= (Uv|Uw) && \text{since } T \text{ is an isometry} \\
&= (v|w) && \text{since } U \text{ is an isometry}
\end{aligned}
\]

so TU is an isometry as well, hence Isom(V) is closed under multiplication. As
for associativity, this axiom is automatically satisfied since Isom(V) ⊂ GL(V) and
multiplication in GL(V) is associative, as we proved above. As for the existence of
the identity, just note that the identity operator I is trivially an isometry. To verify
the existence of inverses, note that T⁻¹ exists and is an isometry since

\[
\begin{aligned}
(T^{-1}v|T^{-1}w) &= (TT^{-1}v|TT^{-1}w) && \text{since } T \in \mathrm{Isom}(V) \\
&= (v|w).
\end{aligned}
\]


Thus Isom(V) is a group. Why is it of interest? Well, as we’ll show in the
next few examples, the matrix representations of Isom(V) actually turn out to be
the orthogonal matrices, the unitary matrices, and the Lorentz transformations,
depending on whether or not V is real or complex and whether or not (·|·) is
positive-definite. Our discussion here shows⁵ that all of these sets of matrices are
groups, and that they can all be thought of as representing linear operators which
preserve the relevant non-degenerate Hermitian form.


Example 4.7. The orthogonal group O(n) 


Let V be an n-dimensional real inner product space. The isometries of V can be
thought of as operators which preserve lengths and angles, since the formula

\[ \cos\theta = \frac{(v|w)}{\sqrt{(v|v)\,(w|w)}} \]

for the angle between v and w is defined purely in terms of the inner product. Now,
if T is an isometry and we write out (v|w) = (Tv|Tw) in components referred to
an orthonormal basis B, we find that

\[
\begin{aligned}
[v]^T [w] &= (v|w) \\
&= (Tv|Tw) \\
&= \delta_{ij}\, (Tv)^i (Tw)^j \\
&= \delta_{ij}\, T_k{}^i v^k\, T_l{}^j w^l \\
&= \sum_i v^k\, T_k{}^i\, T_l{}^i\, w^l \\
&= [v]^T [T]^T [T] [w] \quad \forall\, v, w \in V.
\end{aligned} \tag{4.11}
\]

As you will show in Exercise 4.4, this is true if and only if [T]ᵀ[T] = I, or

\[ [T]^T = [T]^{-1}. \tag{4.12} \]

This is the familiar orthogonality condition, and the set of all orthogonal matrices
(which, as we know from the above discussion and from Problem 3-1, form a group)
is known as the orthogonal group O(n). We will consider O(n) in detail for n =
2, 3 in the next section.

⁴Hence the term “iso-metry” = “same length.”

⁵We are glossing over some subtleties with this claim. See Example 4.20 for the full story.


Exercise 4.4. Show that

\[ [v]^T [w] = [v]^T [T]^T [T] [w] \quad \forall\, v, w \in V \]

if and only if [T]ᵀ[T] = I. One direction is easy; for the other, let v = e_i, w = e_j where
{e_i}_{i=1…n} is orthonormal.


Exercise 4.5. Verify directly that O(n) is a group, by using the defining condition (4.12). 
This is the same as Problem 3-1 b. 
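
The orthogonality condition (4.12) and its link to preserved inner products can also be explored
numerically. In the sketch below, the sample matrices (a rotation, a reflection, and a shear that
is deliberately not orthogonal), the test vectors, and the tolerance are our own assumptions,
picked only to illustrate the statements above.

```python
import math

def transpose(m):
    return [list(col) for col in zip(*m)]

def matmul(a, b):
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]

def matvec(m, v):
    return [sum(x * y for x, y in zip(row, v)) for row in m]

def dot(v, w):
    """Inner product in components referred to an orthonormal basis."""
    return sum(x * y for x, y in zip(v, w))

def is_orthogonal(m, tol=1e-12):
    """Check the condition [T]^T [T] = I up to rounding error."""
    n = len(m)
    prod = matmul(transpose(m), m)
    return all(abs(prod[i][j] - (1.0 if i == j else 0.0)) < tol
               for i in range(n) for j in range(n))

c, s = math.cos(0.7), math.sin(0.7)
rotation = [[c, -s, 0.0], [s, c, 0.0], [0.0, 0.0, 1.0]]
reflection = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, -1.0]]
shear = [[1.0, 1.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]]   # not orthogonal

T = matmul(rotation, reflection)   # product of two orthogonal matrices
v, w = [1.0, 2.0, 3.0], [-1.0, 0.5, 4.0]
difference = dot(matvec(T, v), matvec(T, w)) - dot(v, w)
print(is_orthogonal(T), is_orthogonal(shear), difference)
```

The product of the two orthogonal matrices is again orthogonal (closure), and it preserves the
inner product of the sample vectors up to rounding, while the shear fails the test.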


Example 4.8. The unitary group U(n) 


Now let V be a complex inner product space. Recall from Problem 2-7 that we can
define the adjoint T† of a linear operator T by the equation

\[ (T^\dagger v|w) = (v|Tw). \tag{4.13} \]

If T is an isometry, then we can characterize it in terms of its adjoint, as follows:
first, we have

\[
\begin{aligned}
(v|w) &= (Tv|Tw) \\
&= (T^\dagger T v|w) \quad \forall\, v, w \in V.
\end{aligned}
\]

Calculations identical to those of Exercise 4.4 then show that this can be true if and
only if T†T = TT† = I, which is equivalent to the more familiar condition

\[ T^\dagger = T^{-1}. \tag{4.14} \]

Such an operator is said to be unitary. Thus every isometry of a complex inner
product space is unitary, and vice-versa. Now, if V has dimension n and we choose
an orthonormal basis for V, then we have [T†] = [T]† (cf. Problem 2-7), and
so (4.14) implies

\[ [T]^\dagger = [T]^{-1}. \tag{4.15} \]

Thus, in an orthonormal basis, a unitary operator is represented by a unitary
matrix! (Note that this is NOT necessarily true in a non-orthonormal basis.) By
the discussion preceding Example 4.7, the set of all unitary matrices forms a group,
denoted U(n). We won’t discuss U(n) in depth in this text, but we will discuss one
of its cousins, SU(2) (to be defined below), extensively.

Note that there is nothing in the above discussion that requires V to be complex, 
so we can actually use the same definitions (of adjoints and unitarity) to define 
unitary operators on any inner product space, real or complex. Thus, a unitary
operator is just an isometry of a real or complex inner product space.⁶ In the case of
a real vector space, the unitary matrix condition (4.15) reduces to the orthogonality 
condition (4.12), as you might expect. 
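
Here is a small numerical illustration of the unitary condition (4.15), using Python’s built-in
complex numbers. The sample matrix U and the vectors v and w are assumptions made for the
sake of the example, and the inner product is taken to be conjugate-linear in its first slot.

```python
import math

def dagger(m):
    """Conjugate transpose of a square matrix stored as nested lists."""
    n = len(m)
    return [[m[j][i].conjugate() for j in range(n)] for i in range(n)]

def matmul(a, b):
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def matvec(m, v):
    return [sum(m[i][j] * v[j] for j in range(len(v))) for i in range(len(m))]

def inner(v, w):
    """Hermitian inner product, conjugate-linear in the first slot."""
    return sum(x.conjugate() * y for x, y in zip(v, w))

s = 1 / math.sqrt(2)
U = [[s, 1j * s], [1j * s, s]]      # sample 2 x 2 unitary matrix
v, w = [1 + 2j, -1j], [0.5, 3 - 1j]

UdU = matmul(dagger(U), U)          # should be the identity
gap = abs(inner(matvec(U, v), matvec(U, w)) - inner(v, w))
print(UdU, gap)
```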


Exercise 4.6. Verify directly that U(n) is a group, using the defining condition (4.15).


Example 4.9. The Lorentz group O(n − 1, 1)


Now let V be a real vector space with a Minkowski metric η, which is defined, as
in Example 2.20, as a symmetric, non-degenerate (2,0) tensor whose matrix in an
orthonormal basis has the form

\[
[\eta] = \begin{pmatrix} 1 & & & \\ & \ddots & & \\ & & 1 & \\ & & & -1 \end{pmatrix} \tag{4.16}
\]

with zeros on all the off-diagonals. This is to be compared with (2.33), which is
just (4.16) with n = 4. Now, since η is a non-degenerate Hermitian form, we can
consider its group of isometries. If T ∈ Isom(V), then in analogy to the computation
leading to (4.11), we have (in an arbitrary basis B),

\[
\begin{aligned}
[v]^T [\eta] [w] &= \eta(v, w) \\
&= \eta(Tv, Tw) \\
&= [v]^T [T]^T [\eta] [T] [w] \quad \forall\, v, w \in V.
\end{aligned} \tag{4.17}
\]

⁶In fact, the only reason for speaking of both “isometries” and “unitary operators” is that
unitary operators act solely on inner product spaces, whereas isometries can act on spaces with
non-degenerate Hermitian forms that are not necessarily positive-definite, such as ℝ⁴ with the
Minkowski metric.

Again, the same argument as you used in Exercise 4.4 shows that the above holds if
and only if

\[ [T]^T [\eta] [T] = [\eta] \tag{4.18} \]

which in components reads

\[ T_\mu{}^\rho\, T_\nu{}^\sigma\, \eta_{\rho\sigma} = \eta_{\mu\nu}. \tag{4.19} \]

If B is orthonormal, (4.18) becomes

\[
[T]^T \begin{pmatrix} 1 & & & \\ & \ddots & & \\ & & 1 & \\ & & & -1 \end{pmatrix} [T]
  = \begin{pmatrix} 1 & & & \\ & \ddots & & \\ & & 1 & \\ & & & -1 \end{pmatrix}
\]

which you will recognize from (3.29) as the definition of a Lorentz transformation,
though now we are working in an arbitrary dimension n rather than just dimension
four. We thus see that the set of all Lorentz transformations forms a group, known
as the Lorentz group and denoted by O(n − 1, 1) [the notation just refers to the
number of positive and negative 1’s present in the matrix form of η given in (4.16)].
The Lorentz transformations lie at the heart of special relativity, and we will take a
close look at these matrices for n = 4 in the next section.


Exercise 4.7. Verify directly that O(n − 1, 1) is a group, using the defining condition (4.18).
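
As a sanity check of the defining condition (4.18) for n = 4, the sketch below tests a boost along
the x-axis, with the time coordinate placed in the fourth slot to match the position of the −1
in (4.16). The boost parametrization by rapidity, the function names, and the tolerance are our
own assumptions for the purpose of illustration.

```python
import math

ETA = [[1.0, 0.0, 0.0, 0.0],
       [0.0, 1.0, 0.0, 0.0],
       [0.0, 0.0, 1.0, 0.0],
       [0.0, 0.0, 0.0, -1.0]]

def transpose(m):
    return [list(col) for col in zip(*m)]

def matmul(a, b):
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]

def boost_x(rapidity):
    """Boost along x, with time as the fourth coordinate."""
    ch, sh = math.cosh(rapidity), math.sinh(rapidity)
    return [[ch, 0.0, 0.0, -sh],
            [0.0, 1.0, 0.0, 0.0],
            [0.0, 0.0, 1.0, 0.0],
            [-sh, 0.0, 0.0, ch]]

def is_lorentz(m, tol=1e-9):
    """Check [T]^T [eta] [T] = [eta] up to rounding error."""
    lhs = matmul(matmul(transpose(m), ETA), m)
    return all(abs(lhs[i][j] - ETA[i][j]) < tol
               for i in range(4) for j in range(4))

single = boost_x(0.5)
composite = matmul(boost_x(0.5), boost_x(-1.2))   # closure under products
dilation = [[2.0 if i == j else 0.0 for j in range(4)] for i in range(4)]
print(is_lorentz(single), is_lorentz(composite), is_lorentz(dilation))
```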


Box 4.1 Active and Passive Interpretations of Isometries 

You may recall that we originally defined orthogonal matrices, unitary matri- 
ces, and Lorentz transformations as those matrices which implement a basis 
change from one orthonormal basis to another (on vector spaces with real 
inner products, Hermitian inner products, and Minkowski metrics, respec- 
tively). In the preceding examples, however, we’ve seen that these matrices 
can alternatively be defined as those which represent (in an orthonormal 
basis) operators which preserve a non-degenerate Hermitian form. These two 
definitions correspond to the active and passive viewpoints of transformations: 
our first definition of these matrices (as those which implement orthonormal 
basis changes) gives the passive viewpoint, while the second definition (as 
those matrices which represent isometries) gives the active viewpoint. 


Example 4.10. The special unitary and orthogonal groups SU(n) and SO(n)


The groups O(n) and U(n) have some very important subgroups, the special unitary
and special orthogonal groups, denoted SU(n) and SO(n) respectively, which are
defined as those matrices in U(n) and O(n) that have determinant equal to 1. You
will verify below that these are subgroups of U(n) and O(n). These groups are basic
in mathematics and (for certain n) fundamental in physics: as we’ll see, SO(n) is
the group of rotations in n dimensions, SU(2) is crucial in the theory of angular
momentum in quantum mechanics, and (though we won’t discuss it here) SU(3)
is fundamental in particle physics, especially in the mathematical description of
quarks.


Exercise 4.8. Show that SO(n) and SU(n) are subgroups of O(n) and U(n). 


Before moving on to a more detailed look at some specific instances of the groups 
described above, we switch gears for a moment and consider groups that aren’t 
subsets of GL(n,C). These groups have a very different flavor than the groups 
we’ve been considering, but are useful in physics nonetheless. We’ll make more 
precise the sense in which they differ from the previous examples when we get to 
Sect. 4.5. 


Example 4.11. ℤ₂: The group with two elements


Consider the set ℤ₂ = {+1, −1} ⊂ ℤ with the product being just the usual
multiplication of integers. You can easily check that this is a group, in fact an abelian
group. Though this group may seem trivial and somewhat abstract, it pops up in a
few places in physics, as we’ll see in Sect. 4.4.
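
For a finite set, the four group axioms can be checked by brute force. The following sketch does
this for {+1, −1} under multiplication. The functions `is_group` and `is_abelian` are
hypothetical helpers written for this illustration, and they are only practical for small sets.

```python
from itertools import product

def is_group(elements, op):
    """Brute-force check of the four group axioms on a finite set."""
    elements = list(elements)
    # Axiom 1: closure.
    if not all(op(g, h) in elements for g, h in product(elements, repeat=2)):
        return False
    # Axiom 2: associativity.
    if not all(op(g, op(h, k)) == op(op(g, h), k)
               for g, h, k in product(elements, repeat=3)):
        return False
    # Axiom 3: existence of the identity.
    identities = [e for e in elements
                  if all(op(g, e) == g and op(e, g) == g for g in elements)]
    if not identities:
        return False
    e = identities[0]
    # Axiom 4: existence of inverses.
    return all(any(op(g, h) == e and op(h, g) == e for h in elements)
               for g in elements)

def is_abelian(elements, op):
    """True if every pair of elements commutes."""
    return all(op(g, h) == op(h, g) for g, h in product(elements, repeat=2))

def multiply(a, b):
    return a * b

Z2 = [1, -1]
print(is_group(Z2, multiply), is_abelian(Z2, multiply))
print(is_group([0, 1], multiply))   # fails: 0 has no multiplicative inverse
```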


Example 4.12. S_n: The symmetric group on n letters


This group does not usually occur explicitly in physics but is intimately tied
to permutation symmetry, the physics of identical particles, and much of the
mathematics we discussed in Sect. 3.8. The symmetric group on n letters (also
known as the permutation group), denoted S_n, is defined to be the set of all
one-to-one and onto maps of the set {1, 2, …, n} to itself, where the product is just the
composition of maps. The maps are known as permutations. You should check that
any composition of permutations is again a permutation and that permutations are
invertible, so that S_n is a group. This verification is simple, and just relies on the
fact that permutations are, by definition, one-to-one and onto.

Any permutation σ is specified by the n numbers σ(i), i = 1, …, n, and can
conveniently be notated as

\[ \begin{pmatrix} 1 & 2 & \cdots & n \\ \sigma(1) & \sigma(2) & \cdots & \sigma(n) \end{pmatrix}. \]

In such a scheme, the identity in S₃ would just look like

\[ \begin{pmatrix} 1 & 2 & 3 \\ 1 & 2 & 3 \end{pmatrix} \]

while the cyclic permutation σ₁ given by 1 → 2, 2 → 3, 3 → 1 would look like

\[ \sigma_1 = \begin{pmatrix} 1 & 2 & 3 \\ 2 & 3 & 1 \end{pmatrix}. \]

The transposition σ₂ which switches 1 and 2 and leaves 3 alone would look like

\[ \sigma_2 = \begin{pmatrix} 1 & 2 & 3 \\ 2 & 1 & 3 \end{pmatrix}. \]

How do we take products of permutations? Well, the product σ₁ · σ₂ would take on
the following values:

\[
\begin{aligned}
(\sigma_1 \cdot \sigma_2)(1) &= \sigma_1(\sigma_2(1)) = \sigma_1(2) = 3 \\
(\sigma_1 \cdot \sigma_2)(2) &= \sigma_1(1) = 2 \\
(\sigma_1 \cdot \sigma_2)(3) &= \sigma_1(3) = 1
\end{aligned} \tag{4.20}
\]

so we have

\[
\sigma_1 \cdot \sigma_2
  = \begin{pmatrix} 1 & 2 & 3 \\ 2 & 3 & 1 \end{pmatrix}
    \begin{pmatrix} 1 & 2 & 3 \\ 2 & 1 & 3 \end{pmatrix}
  = \begin{pmatrix} 1 & 2 & 3 \\ 3 & 2 & 1 \end{pmatrix}. \tag{4.21}
\]

You should take the time to inspect (4.21) and understand how to take such a product
of permutations without having to write out (4.20).

Though a proper discussion of the applications of S_n to physics must wait until
Sect. 4.4, we point out here that if we have a vector space V and consider its n-fold
tensor product T_n^0(V), then S_n acts on product states by

\[
\sigma(v_1 \otimes v_2 \otimes \cdots \otimes v_n)
  = v_{\sigma(1)} \otimes v_{\sigma(2)} \otimes \cdots \otimes v_{\sigma(n)}.
\]

(Note that this generalizes the permutation operator introduced in Example 3.26.)
A generic element of T_n^0(V) will be a sum of such product states, and the action
of σ ∈ S_n on these more general states is determined by imposing the linearity
condition. In the case of n identical particles in quantum mechanics, where the
total Hilbert space is naively the n-fold tensor product T_n^0(H) of the single-particle
Hilbert space H, this action effectively interchanges particles, and we will later
restate the symmetrization postulate from Example 3.26 in terms of this action of
S_n on T_n^0(H).
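
The product of permutations in (4.20) and (4.21) can be reproduced with a few lines of code.
In this sketch a permutation is stored as a dictionary sending i to σ(i); that representation and
the function name `compose` are our own choices.

```python
from itertools import permutations

def compose(sigma, tau):
    """Product sigma . tau: apply tau first, then sigma."""
    return {i: sigma[tau[i]] for i in tau}

identity = {1: 1, 2: 2, 3: 3}
sigma1 = {1: 2, 2: 3, 3: 1}   # cyclic permutation 1 -> 2, 2 -> 3, 3 -> 1
sigma2 = {1: 2, 2: 1, 3: 3}   # transposition of 1 and 2

product_12 = compose(sigma1, sigma2)   # sigma1 . sigma2, as in (4.21)
product_21 = compose(sigma2, sigma1)   # the product in the other order

# Every element of S_3, built from all orderings of (1, 2, 3).
S3 = [dict(zip((1, 2, 3), image)) for image in permutations((1, 2, 3))]

print(product_12)
print(product_21)
print(len(S3))
```

The two products differ, so S₃ is non-abelian, and listing all the orderings of (1, 2, 3) gives a
small instance of the counting asked for in the next exercise.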


Exercise 4.9. Show that S_n has n! elements.


4.3 The Groups of Classical and Quantum Physics


We are now ready for a detailed look at some of the specific groups which arise in
physics.


Example 4.13. SO(2): Special orthogonal group in two dimensions


As discussed above, SO(2) is the group of all orthogonal 2 × 2 matrices with
determinant equal to 1. You will check in Exercise 4.10 that SO(2) is abelian and
that the general form of an element of SO(2) is

\[ \begin{pmatrix} \cos\theta & -\sin\theta \\ \sin\theta & \cos\theta \end{pmatrix}. \tag{4.22} \]

You will recognize that such a matrix represents a counterclockwise rotation of
θ radians in the x–y plane, as discussed in Example 4.1. Though we won’t
discuss SO(2) very much, it serves as a nice warmup for the next example, which
is ubiquitous in physics and will be discussed throughout the text.


Exercise 4.10. Consider an arbitrary matrix

\[ A = \begin{pmatrix} a & b \\ c & d \end{pmatrix} \]

and impose the orthogonality condition, as well as |A| = 1. Show that (4.22) is the most
general solution to these constraints. Then, verify explicitly that SO(2) is a group (even
though we already know it is by Exercise 4.8) by showing that the product of two matrices
of the form (4.22) is again a matrix of the form (4.22). This will also show that SO(2) is
abelian.
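
Before doing the algebra, you may find it reassuring to see the closure and commutativity of
SO(2) numerically. The sketch below multiplies two matrices of the form (4.22) and compares
the result with the matrix for the summed angle; the sample angles and helper names are
arbitrary choices of ours.

```python
import math

def rot(theta):
    """Matrix of the form (4.22)."""
    c, s = math.cos(theta), math.sin(theta)
    return [[c, -s], [s, c]]

def matmul(a, b):
    return [[sum(a[i][k] * b[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

def close(a, b, tol=1e-12):
    """Entrywise comparison of two 2 x 2 matrices up to rounding."""
    return all(abs(a[i][j] - b[i][j]) < tol for i in range(2) for j in range(2))

a, b = 0.4, 1.9
product_ab = matmul(rot(a), rot(b))
product_ba = matmul(rot(b), rot(a))
determinant = product_ab[0][0] * product_ab[1][1] - product_ab[0][1] * product_ab[1][0]

print(close(product_ab, rot(a + b)))   # the product is again of the form (4.22)
print(close(product_ab, product_ba))   # the two orders agree
```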


Example 4.14. SO(3): Special orthogonal group in three dimensions


This group is of great importance in physics, as it is the group of all rotations in 
three-dimensional space! For that statement to mean anything, however, we must 
carefully define what a “rotation” is. One commonly used definition is the following: 


Definition. A rotation in n dimensions is any linear operator R which can be
obtained continuously from the identity⁷ and takes orthonormal bases to orthonormal
bases. This means that for any orthonormal basis {e_i}_{i=1…n}, {Re_i}_{i=1…n} must
also be an orthonormal basis.


You will show in Problem 4-1 that this definition is equivalent to saying R ∈ SO(n).

⁷Meaning that there exists a continuous map γ : [0, 1] → GL(n, ℝ) such that γ(0) = I and
γ(1) = R. In other words, there is a path of invertible matrices connecting R to I.

Given that SO(3) really is the group of three-dimensional rotations, then, can
we find a general form for an element of SO(3)? As you may know from classical
mechanics courses, an arbitrary rotation can be described in terms of the Euler
angles, which tell us how to rotate a given orthonormal basis into another of the
same orientation (or handedness). In classical mechanics texts,⁸ it is shown that this
can be achieved by rotating the given axes by an angle φ around the original z-axis,
then by an angle θ around the new x-axis, and finally by an angle ψ around the new
z-axis. If we take the passive point of view, these three rotations take the form

\[
\begin{pmatrix} \cos\psi & \sin\psi & 0 \\ -\sin\psi & \cos\psi & 0 \\ 0 & 0 & 1 \end{pmatrix}, \quad
\begin{pmatrix} 1 & 0 & 0 \\ 0 & \cos\theta & \sin\theta \\ 0 & -\sin\theta & \cos\theta \end{pmatrix}, \quad
\begin{pmatrix} \cos\phi & \sin\phi & 0 \\ -\sin\phi & \cos\phi & 0 \\ 0 & 0 & 1 \end{pmatrix}
\]

so multiplying them together gives a general form for R ∈ SO(3):

\[
\begin{pmatrix}
\cos\psi\cos\phi - \cos\theta\sin\phi\sin\psi & \cos\psi\sin\phi + \cos\theta\cos\phi\sin\psi & \sin\psi\sin\theta \\
-\sin\psi\cos\phi - \cos\theta\sin\phi\cos\psi & -\sin\psi\sin\phi + \cos\theta\cos\phi\cos\psi & \cos\psi\sin\theta \\
\sin\theta\sin\phi & -\sin\theta\cos\phi & \cos\theta
\end{pmatrix} \tag{4.23}
\]
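
The closed form (4.23) can be compared against the explicit product of the three passive
rotation matrices. The sketch below does so for one sample set of angles, multiplying the
ψ, θ, and φ matrices in the order in which they are listed above. The function names and the
sample angles are our own choices.

```python
import math

def rot_z_passive(angle):
    """Passive rotation about the z-axis."""
    c, s = math.cos(angle), math.sin(angle)
    return [[c, s, 0.0], [-s, c, 0.0], [0.0, 0.0, 1.0]]

def rot_x_passive(angle):
    """Passive rotation about the x-axis."""
    c, s = math.cos(angle), math.sin(angle)
    return [[1.0, 0.0, 0.0], [0.0, c, s], [0.0, -s, c]]

def matmul(a, b):
    return [[sum(a[i][k] * b[k][j] for k in range(3)) for j in range(3)]
            for i in range(3)]

def euler_closed_form(phi, theta, psi):
    """Entries of the matrix (4.23)."""
    cf, sf = math.cos(phi), math.sin(phi)
    ct, st = math.cos(theta), math.sin(theta)
    cp, sp = math.cos(psi), math.sin(psi)
    return [[cp * cf - ct * sf * sp, cp * sf + ct * cf * sp, sp * st],
            [-sp * cf - ct * sf * cp, -sp * sf + ct * cf * cp, cp * st],
            [st * sf, -st * cf, ct]]

phi, theta, psi = 0.3, 1.1, -0.8
explicit = matmul(rot_z_passive(psi),
                  matmul(rot_x_passive(theta), rot_z_passive(phi)))
closed = euler_closed_form(phi, theta, psi)
max_diff = max(abs(explicit[i][j] - closed[i][j])
               for i in range(3) for j in range(3))
print(max_diff)   # should be at the level of rounding error
```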


Another general form for R ∈ SO(3) is that of a rotation by an arbitrary angle θ
about an arbitrary axis n̂; you will see in Sect. 4.7 that this is given by

\[
\begin{pmatrix}
n_x^2(1-\cos\theta) + \cos\theta & n_x n_y(1-\cos\theta) - n_z\sin\theta & n_x n_z(1-\cos\theta) + n_y\sin\theta \\
n_y n_x(1-\cos\theta) + n_z\sin\theta & n_y^2(1-\cos\theta) + \cos\theta & n_y n_z(1-\cos\theta) - n_x\sin\theta \\
n_z n_x(1-\cos\theta) - n_y\sin\theta & n_z n_y(1-\cos\theta) + n_x\sin\theta & n_z^2(1-\cos\theta) + \cos\theta
\end{pmatrix} \tag{4.24}
\]

where n̂ = (n_x, n_y, n_z) and the components of n̂ are not all independent since n_x² +
n_y² + n_z² = 1. This constraint, along with the three components of n̂ and the angle
θ, gives us three free parameters with which to describe an arbitrary rotation, just as
with the Euler angles. For a nice geometric interpretation of the above matrix, see
Problem 4-3.
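
A few properties of the matrix (4.24) can be confirmed numerically: it is orthogonal, it has unit
determinant, and it leaves the axis n̂ fixed. The sketch below checks these for one sample unit
vector and angle; the sample values and helper names are assumptions made for illustration.

```python
import math

def axis_angle(n, theta):
    """Matrix (4.24) for a unit axis n = (nx, ny, nz) and angle theta."""
    nx, ny, nz = n
    c, s = math.cos(theta), math.sin(theta)
    k = 1.0 - c
    return [[nx * nx * k + c, nx * ny * k - nz * s, nx * nz * k + ny * s],
            [ny * nx * k + nz * s, ny * ny * k + c, ny * nz * k - nx * s],
            [nz * nx * k - ny * s, nz * ny * k + nx * s, nz * nz * k + c]]

def matvec(m, v):
    return [sum(m[i][j] * v[j] for j in range(3)) for i in range(3)]

def gram(m):
    """Entries of m^T m, which is the identity when m is orthogonal."""
    return [[sum(m[k][i] * m[k][j] for k in range(3)) for j in range(3)]
            for i in range(3)]

def det3(m):
    return (m[0][0] * (m[1][1] * m[2][2] - m[1][2] * m[2][1])
            - m[0][1] * (m[1][0] * m[2][2] - m[1][2] * m[2][0])
            + m[0][2] * (m[1][0] * m[2][1] - m[1][1] * m[2][0]))

n = (1 / 3, 2 / 3, 2 / 3)     # sample unit vector: 1/9 + 4/9 + 4/9 = 1
R = axis_angle(n, 0.9)
G = gram(R)
fixed = matvec(R, n)          # should reproduce n
print(det3(R), fixed)
```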


Example 4.15. O(3): Orthogonal group in three dimensions


If SO(3) is the group of all three-dimensional rotations, then what are we to make
of O(3), the group of all orthogonal 3 × 3 matrices without the restriction on the
determinant? Well, as we pointed out in Example 3.29, the orthogonality condition
actually implies⁹ that |R| = ±1, so in going from SO(3) to O(3) we are just adding
all the orthogonal matrices with |R| = −1. These new matrices are sometimes
referred to as improper rotations, as opposed to the elements with |R| = 1 which
are known as proper rotations. Now, amongst the improper rotations is our old
friend the inversion transformation

\[ -I = \begin{pmatrix} -1 & 0 & 0 \\ 0 & -1 & 0 \\ 0 & 0 & -1 \end{pmatrix}. \]

Any improper rotation can be written as the product of a proper rotation and the
inversion transformation, as R = (−I)(−R) (note that if R is an improper rotation,
then −R is a proper rotation). Thus, an improper rotation can be thought of as a
proper rotation followed¹⁰ by the inversion transformation.

One important feature of O(3) is that its two parts, the proper and improper
rotations, are disconnected, in the sense that one cannot continuously go from
matrices with |R| = 1 to matrices with |R| = −1. (If one can continuously go
from one group element to any other, then the group is said to be connected. It
is disconnected if it is not connected.) One can, however, multiply by −I to go
between the two components. This is represented schematically in Fig. 4.3. Note
that the stipulation in our definition that a rotation must be continuously obtainable
from the identity excludes all the improper rotations, as it should.

⁸Such as Goldstein [8].

⁹This fact can be understood geometrically: since orthogonal matrices preserve distances and
angles, they should preserve volumes as well. As we learned in Example 3.28, the determinant
measures how volume changes under the action of a linear operator, so any volume preserving
operator should have determinant ±1. The sign is determined by whether or not the orientation is
reversed.
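
The decomposition R = (−I)(−R) is simple to test on an example. In the sketch below we build
an improper rotation as a mirror reflection times a proper rotation (an assumption made just to
have a concrete matrix with determinant −1), and then check that negating it gives determinant
+1 and that multiplying by −I recovers it.

```python
import math

def matmul(a, b):
    return [[sum(a[i][k] * b[k][j] for k in range(3)) for j in range(3)]
            for i in range(3)]

def negate(m):
    return [[-x for x in row] for row in m]

def det3(m):
    return (m[0][0] * (m[1][1] * m[2][2] - m[1][2] * m[2][1])
            - m[0][1] * (m[1][0] * m[2][2] - m[1][2] * m[2][0])
            + m[0][2] * (m[1][0] * m[2][1] - m[1][1] * m[2][0]))

c, s = math.cos(1.0), math.sin(1.0)
proper = [[c, -s, 0.0], [s, c, 0.0], [0.0, 0.0, 1.0]]         # rotation about z
mirror = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, -1.0]]  # determinant -1
improper = matmul(mirror, proper)

inversion = negate([[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]])
recovered = matmul(inversion, negate(improper))   # (-I)(-R)
max_diff = max(abs(recovered[i][j] - improper[i][j])
               for i in range(3) for j in range(3))

print(det3(improper), det3(negate(improper)), max_diff)
```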


Example 4.16. SU(2): Special unitary group in two complex dimensions


As mentioned in Example 4.10, SU(2) is the group of all 2 × 2 complex matrices A
which satisfy |A| = 1 and

\[ A^\dagger = A^{-1}. \]

You can check (see Exercise 4.11 below) that a generic element of SU(2) looks like

\[
\begin{pmatrix} \alpha & \beta \\ -\bar{\beta} & \bar{\alpha} \end{pmatrix}, \qquad
\alpha, \beta \in \mathbb{C}, \quad |\alpha|^2 + |\beta|^2 = 1. \tag{4.25}
\]

[Fig. 4.3: The two components of O(3), the proper rotations and the improper rotations. The
proper rotations are just SO(3). Multiplying by the inversion transformation −I takes one back
and forth between the two components.]

¹⁰One can actually think of the inversion as following or preceding the proper rotation, since −I
commutes with all matrices.

We could also use three real parameters with no conditions rather than two complex
parameters with a constraint; one such parametrization is

\[
\begin{pmatrix}
e^{i(\psi+\phi)/2} \cos\frac{\theta}{2} & i\, e^{i(\psi-\phi)/2} \sin\frac{\theta}{2} \\
i\, e^{-i(\psi-\phi)/2} \sin\frac{\theta}{2} & e^{-i(\psi+\phi)/2} \cos\frac{\theta}{2}
\end{pmatrix} \tag{4.26}
\]

where we have used the same symbols for our parameters as we did for the Euler
angles. Another real parameterization is

\[
\begin{pmatrix}
\cos(\theta/2) - i n_z \sin(\theta/2) & (-i n_x - n_y)\sin(\theta/2) \\
(-i n_x + n_y)\sin(\theta/2) & \cos(\theta/2) + i n_z \sin(\theta/2)
\end{pmatrix}, \qquad n_x^2 + n_y^2 + n_z^2 = 1 \tag{4.27}
\]

where n̂ = (n_x, n_y, n_z) is a unit vector and θ is an arbitrary angle. The similarities
between the SU(2) parameterizations (4.26) and (4.27) and the SO(3)
parameterizations (4.23) and (4.24) are no accident, as there is a close relationship
between SU(2) and SO(3), which we will discuss in detail in the next section.
This relationship underlies the appearance of SU(2) in quantum mechanics, where
rotations are implemented on spin 1/2 particles by elements of SU(2). In fact,
we will see that a rotation with Euler angles φ, θ, and ψ is implemented by the
matrix (4.26), and a rotation of angle θ about the axis n̂ is implemented by the
matrix (4.27)!


Exercise 4.11. Consider an arbitrary complex matrix

\[ \begin{pmatrix} \alpha & \beta \\ \gamma & \delta \end{pmatrix} \]

and impose the unit determinant and unitary conditions. Show that (4.25) is the most general
solution to these constraints. Then show that any such solution can also be written in the
form (4.26).
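
The two real parametrizations (4.26) and (4.27) can be checked against the general form (4.25)
numerically. The sketch below builds both matrices for sample parameter values and tests that
each has the pattern of (4.25), as does their product. All function names and sample values are
our own; this is an illustration, not a solution of the exercise.

```python
import cmath
import math

def su2_euler(phi, theta, psi):
    """Matrix (4.26) built from the three Euler-angle parameters."""
    c, s = math.cos(theta / 2), math.sin(theta / 2)
    return [[cmath.exp(1j * (psi + phi) / 2) * c,
             1j * cmath.exp(1j * (psi - phi) / 2) * s],
            [1j * cmath.exp(-1j * (psi - phi) / 2) * s,
             cmath.exp(-1j * (psi + phi) / 2) * c]]

def su2_axis_angle(n, theta):
    """Matrix (4.27) for a unit vector n and an angle theta."""
    nx, ny, nz = n
    c, s = math.cos(theta / 2), math.sin(theta / 2)
    return [[c - 1j * nz * s, (-1j * nx - ny) * s],
            [(-1j * nx + ny) * s, c + 1j * nz * s]]

def matmul(a, b):
    return [[sum(a[i][k] * b[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

def is_su2(m, tol=1e-12):
    """Check the pattern (4.25): [[a, b], [-conj(b), conj(a)]], |a|^2 + |b|^2 = 1."""
    (a, b), (c, d) = m
    return (abs(c + b.conjugate()) < tol
            and abs(d - a.conjugate()) < tol
            and abs(abs(a) ** 2 + abs(b) ** 2 - 1) < tol)

A = su2_euler(0.3, 1.1, -0.8)
B = su2_axis_angle((1 / 3, 2 / 3, 2 / 3), 0.9)
AB = matmul(A, B)
det_AB = AB[0][0] * AB[1][1] - AB[0][1] * AB[1][0]
print(is_su2(A), is_su2(B), is_su2(AB))
```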


Box 4.2 SU(2) as the 3-sphere
You may have noticed that the constraint |α|² + |β|² = 1 in (4.25) has a simple,
symmetric form, which seems to hint at something deeper. Let us elaborate on
this.

If we write α and β in terms of real numbers as

\[ \alpha = u + iv, \qquad \beta = x + iy \]

then |α|² + |β|² = 1 becomes

\[ u^2 + v^2 + x^2 + y^2 = 1. \tag{4.28} \]

This equation is clearly analogous to the algebraic equations x² + y² = 1
for the unit circle and x² + y² + z² = 1 for the surface of the unit sphere.




Equation (4.28) is, in fact, the equation of the 3-sphere, which you can think 
of as a three-dimensional “spherical” space embedded in four-dimensions, just 
as the unit circle (or “1-sphere’’) is a one-dimensional space embedded in 2-D, 
and the surface of the unit sphere (or “2-sphere’’) is a two-dimensional space 
embedded in 3-D. Equations (4.25) and (4.28) tell us that 


SU(2) is the 3-sphere. 


What’s more, you can think of +7 € SU(2), coordinatized by (u, v, x, y) = 
+(1,0,0,0), as two “poles” of the 3-sphere, in analogy with the North and 
South poles (x,y,z) = +(0,0,1) of the 2-sphere. This is schematically 
illustrated in Fig. 4.4. 

Though one can take this much further and analyze the “spherical” 
geometry of the 3-sphere and its relation to the SU(2) group structure (see 
Frankel [6], section 21.4), our point here is that SU(2) can be thought of 
not just as a parametrized set of matrices but as a highly symmetric multi- 
dimensional space in its own right. This is also true for SO(2), which by (4.22) 
is clearly just the unit circle. It’s also true of other, more complicated groups 
like SO(3) and SO(3,1)_o (see the next example), but in these cases there is no 
simple analogy between the structure of these groups and that of more familiar 
spaces, so they are harder to visualize. However, it’s important to remember 
that such interpretations of these groups do exist, and we will come back to this 
way of thinking in Sect. 4.5. 
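
The identification of SU(2) with the 3-sphere can be checked numerically. The sketch below assumes that (4.25) has the standard form with rows (α, β) and (−β̄, ᾱ); the helper names (su2_from_sphere, random_point_on_s3, is_su2) are hypothetical and introduced only for this illustration. It samples points (u, v, x, y) on the unit 3-sphere and confirms that each one gives a unitary matrix of unit determinant.

```python
import math
import random

def su2_from_sphere(u, v, x, y):
    """Build [[a, b], [-conj(b), conj(a)]] with a = u + iv and b = x + iy."""
    a, b = complex(u, v), complex(x, y)
    return [[a, b], [-b.conjugate(), a.conjugate()]]

def random_point_on_s3(rng):
    """Sample (u, v, x, y) with u^2 + v^2 + x^2 + y^2 = 1 by normalizing Gaussians."""
    p = [rng.gauss(0.0, 1.0) for _ in range(4)]
    norm = math.sqrt(sum(c * c for c in p))
    return [c / norm for c in p]

def dagger(m):
    """Conjugate transpose of a 2x2 matrix."""
    return [[m[j][i].conjugate() for j in range(2)] for i in range(2)]

def matmul(a, b):
    """Product of two 2x2 matrices."""
    return [[sum(a[i][k] * b[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

def det(m):
    return m[0][0] * m[1][1] - m[0][1] * m[1][0]

def is_su2(m, tol=1e-12):
    """Check unitarity (m m^dagger = I) and unit determinant."""
    prod = matmul(m, dagger(m))
    unitary = all(abs(prod[i][j] - (1 if i == j else 0)) < tol
                  for i in range(2) for j in range(2))
    return unitary and abs(det(m) - 1) < tol

rng = random.Random(0)
samples = [su2_from_sphere(*random_point_on_s3(rng)) for _ in range(100)]
all_in_su2 = all(is_su2(m) for m in samples)

north = su2_from_sphere(1, 0, 0, 0)    # the pole +I
south = su2_from_sphere(-1, 0, 0, 0)   # the pole -I
print(all_in_su2)
```

The two poles (u, v, x, y) = ±(1, 0, 0, 0) come out as ±I, in line with the picture described in the box.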


Fig. 4.4 The Lie group SU(2) represented as a multidimensional spherical object. One can 
identify SU(2) with the 3-sphere, as per (4.25) and (4.28), but this is difficult to visualize 
(especially on paper), so we schematically represent it here as the 2-sphere. The point is to 
think of matrix Lie groups not just as parametrized sets of matrices, but as multidimensional 
spaces in their own right 


Example 4.17. SO(3,1)_o The restricted Lorentz group 


The restricted Lorentz group SO(3,1)_o is defined to be the set of all A ∈ O(3,1) 
which satisfy |A| = 1 as well as A_44 ≥ 1. You will verify in Problem 4-4 that 
SO(3,1)_o is a subgroup of O(3,1). Where does its definition come from? Well, 
just as O(3) can be interpreted physically as the set of all orthonormal coordinate 
transformations, O(3,1) can be interpreted as the set of all transformations between 
inertial reference frames. However, we often are interested in restricting those 
transformations to those which preserve the orientation of time and space, which is 
what the additional conditions |A| = 1 and A_44 ≥ 1 do. Since the (4,4) component 
of the defining equation of O(3,1) forces |A_44| ≥ 1, the condition A_44 ≥ 1 is 
equivalent to A_44 > 0. The condition A_44 > 0 
means that A doesn’t reverse the direction of time, so that clocks in the new 
coordinates aren’t running backwards. This, together with |A| = 1, then implies 
that A does not reverse the orientation of the space axes. Such transformations are 
known as restricted Lorentz transformations. 
The most familiar such transformation is probably 


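\[ L = \begin{pmatrix} 1 & 0 & 0 & 0 \\ 0 & 1 & 0 & 0 \\ 0 & 0 & \gamma & -\beta\gamma \\ 0 & 0 & -\beta\gamma & \gamma \end{pmatrix}, \qquad -1 < \beta < 1, \quad \gamma \equiv \frac{1}{\sqrt{1-\beta^2}} \tag{4.29} \]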


which is interpreted passively¹¹ as a coordinate transformation to a new reference 
frame that is unrotated relative to the old frame but is moving uniformly along the 
z-axis with relative velocity β.¹² Such a transformation is often referred to as a boost 
along the z-axis, and is also sometimes written as 


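\[ L = \begin{pmatrix} 1 & 0 & 0 & 0 \\ 0 & 1 & 0 & 0 \\ 0 & 0 & \cosh u & -\sinh u \\ 0 & 0 & -\sinh u & \cosh u \end{pmatrix}, \qquad u \in \mathbb{R} \tag{4.30} \]

where u is a quantity known as the rapidity and is related to β by 

\[ \tanh u = \beta. \]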
(You should check that the above L matrices really are in SO(3,1)_o.) We could 
also boost along any other spatial direction; if the relative velocity vector is 
β = (βx, βy, βz), then the corresponding matrix should be obtainable from (4.29) 
by an orthogonal similarity transformation that takes ẑ into β̂. You will show in 
Exercise 4.12 below that this yields 

\[ L = \begin{pmatrix}
\frac{\beta_x^2(\gamma-1)}{\beta^2} + 1 & \frac{\beta_x\beta_y(\gamma-1)}{\beta^2} & \frac{\beta_x\beta_z(\gamma-1)}{\beta^2} & -\beta_x\gamma \\
\frac{\beta_y\beta_x(\gamma-1)}{\beta^2} & \frac{\beta_y^2(\gamma-1)}{\beta^2} + 1 & \frac{\beta_y\beta_z(\gamma-1)}{\beta^2} & -\beta_y\gamma \\
\frac{\beta_z\beta_x(\gamma-1)}{\beta^2} & \frac{\beta_z\beta_y(\gamma-1)}{\beta^2} & \frac{\beta_z^2(\gamma-1)}{\beta^2} + 1 & -\beta_z\gamma \\
-\beta_x\gamma & -\beta_y\gamma & -\beta_z\gamma & \gamma
\end{pmatrix}. \tag{4.31} \]

If we generalize the relation between u and β to three-dimensions as 

\[ \boldsymbol{\beta} = \frac{\tanh u}{u}\,\mathbf{u}, \qquad u \equiv |\mathbf{u}| \tag{4.32} \]

then you can check that this arbitrary boost can also be written as 

\[ L = \begin{pmatrix}
\frac{u_x^2(\cosh u-1)}{u^2} + 1 & \frac{u_x u_y(\cosh u-1)}{u^2} & \frac{u_x u_z(\cosh u-1)}{u^2} & -\frac{u_x}{u}\sinh u \\
\frac{u_y u_x(\cosh u-1)}{u^2} & \frac{u_y^2(\cosh u-1)}{u^2} + 1 & \frac{u_y u_z(\cosh u-1)}{u^2} & -\frac{u_y}{u}\sinh u \\
\frac{u_z u_x(\cosh u-1)}{u^2} & \frac{u_z u_y(\cosh u-1)}{u^2} & \frac{u_z^2(\cosh u-1)}{u^2} + 1 & -\frac{u_z}{u}\sinh u \\
-\frac{u_x}{u}\sinh u & -\frac{u_y}{u}\sinh u & -\frac{u_z}{u}\sinh u & \cosh u
\end{pmatrix}. \tag{4.33} \]


¹¹ It’s worth noting that, in contrast to rotations, Lorentz transformations are pretty much always 
interpreted passively. A vector in R⁴ is considered an event, and it doesn’t make much sense to start 
moving that event around in spacetime (the active interpretation), though it does make sense to ask 
what a different observer’s coordinates for that particular event would be (passive interpretation). 

¹² Note that β is measured in units of the speed of light, hence the restriction −1 < β < 1. 


Note that the set of boosts is not closed under matrix multiplication; you could check 
this directly (and laboriously) using (4.31) or (4.33), but we’ll prove it more simply 
and elegantly in Example 4.33. 
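
As a rough numerical illustration of these statements, the sketch below builds boosts of the form (4.31), checks that they lie in SO(3,1)_o, and then multiplies a boost along x by a boost along y. It assumes the metric η = diag(1, 1, 1, −1) with time as the fourth coordinate and the defining condition AᵀηA = η; the function names are hypothetical. Every matrix of the form (4.31) is symmetric, so a product that fails to be symmetric cannot be a pure boost.

```python
import math

# Minkowski metric with time as the fourth coordinate (assumed convention).
ETA = [[1, 0, 0, 0], [0, 1, 0, 0], [0, 0, 1, 0], [0, 0, 0, -1]]

def matmul(a, b):
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def transpose(a):
    return [list(row) for row in zip(*a)]

def det(m):
    """Determinant by Laplace expansion along the first row."""
    if len(m) == 1:
        return m[0][0]
    return sum((-1) ** j * m[0][j] * det([row[:j] + row[j + 1:] for row in m[1:]])
               for j in range(len(m)))

def close(a, b, tol=1e-12):
    return all(abs(x - y) < tol for ra, rb in zip(a, b) for x, y in zip(ra, rb))

def boost(bx, by, bz):
    """Boost matrix of the form (4.31) for a nonzero velocity with |beta| < 1."""
    b = (bx, by, bz)
    b2 = sum(c * c for c in b)
    g = 1.0 / math.sqrt(1.0 - b2)
    rows = [[(g - 1.0) * b[i] * b[j] / b2 + (1.0 if i == j else 0.0)
             for j in range(3)] + [-g * b[i]] for i in range(3)]
    rows.append([-g * bx, -g * by, -g * bz, g])
    return rows

def in_restricted_lorentz_group(a, tol=1e-12):
    """Check A^T eta A = eta, det A = 1, and A_44 >= 1."""
    preserves_eta = close(matmul(transpose(a), matmul(ETA, a)), ETA, tol)
    return preserves_eta and abs(det(a) - 1.0) < tol and a[3][3] >= 1.0 - tol

def is_symmetric(a):
    return close(a, transpose(a))

Lx = boost(0.6, 0.0, 0.0)
Ly = boost(0.0, 0.6, 0.0)
product = matmul(Lx, Ly)

print(in_restricted_lorentz_group(product), is_symmetric(product))
```

The product is still a restricted Lorentz transformation, but it is not symmetric, so it must involve a rotation as well as a boost.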

Now we know what boosts look like, but how about an arbitrary restricted 
Lorentz transformation? Well, the nice thing about SO(3,1)_o is that any element 
A can be decomposed as A = LR′, where 

\[ R' = \begin{pmatrix} R & 0 \\ 0 & 1 \end{pmatrix}, \qquad R \in SO(3) \tag{4.34} \]

and L is of the form (4.31). This is the usual decomposition of an arbitrary restricted 
Lorentz transformation into a rotation and a boost, which you will perform in 
Problem 4-5. Note that L has three arbitrary parameters and R has three more, so that 
our arbitrary restricted Lorentz transformation LR′ has six parameters total. 


Exercise 4.12. Construct an orthogonal matrix A which implements an orthonormal 
change of basis from the standard basis {x̂, ŷ, ẑ} to one of the form {r1, r2, β̂} where the r_i 
are any two vectors mutually orthonormal with β̂ and each other. Embed A in SO(3,1)_o as 
in (4.34) and use this to obtain (4.31) by performing a similarity transformation on (4.29). 
Parts of Problem 3-1 may be useful here. 


Exercise 4.13. (Properties of Boosts) 


(a) Check that L in (4.31) really does represent a boost of velocity β as follows: Use L as a 
passive transformation to obtain new coordinates (x′, y′, z′, t′) from the old ones by 

\[ \begin{pmatrix} x' \\ y' \\ z' \\ t' \end{pmatrix} = L \begin{pmatrix} x \\ y \\ z \\ t \end{pmatrix}. \]

Show that the spatial origin of the unprimed frame, defined by x = y = z = 0, moves with 
velocity −β in the primed coordinate system, which tells us that the primed coordinate system 
moves with velocity +β with respect to the unprimed system. 

(b) A photon traveling in the +z direction has energy-momentum 4-vector¹³ given by 
(E/c,0,0, E), where E is the photon energy and we have restored the speed of light c. 
By applying a boost in the z direction [as given by (4.29)] and using the quantum-mechanical 
relation E = hν between a photon’s energy and its frequency ν, derive the relativistic Doppler 
shift 

\[ \nu' = \sqrt{\frac{1-\beta}{1+\beta}}\;\nu. \]


Example 4.18. O(3, 1) The extended Lorentz group 


In the previous example we restricted our changes of inertial reference frame to 
those which preserved the orientation of space and time. This is sufficient in classical 
mechanics, but in quantum mechanics we are often interested in the effects of space 
and time inversion on the various Hilbert spaces we’re working with. If we add 
spatial inversion, also called parity and represented by the matrix 


\[ P = \begin{pmatrix} -1 & 0 & 0 & 0 \\ 0 & -1 & 0 & 0 \\ 0 & 0 & -1 & 0 \\ 0 & 0 & 0 & 1 \end{pmatrix} \tag{4.35} \]


as well as time-reversal, represented by 


\[ T = \begin{pmatrix} 1 & 0 & 0 & 0 \\ 0 & 1 & 0 & 0 \\ 0 & 0 & 1 & 0 \\ 0 & 0 & 0 & -1 \end{pmatrix} \tag{4.36} \]


to the restricted Lorentz group, we actually recover O(3,1), which is thus known 
as the improper or extended Lorentz group. You should verify that P, T ∈ O(3,1), 
but P, T ∉ SO(3,1)_o. In fact, |P| = |T| = −1, which is no accident; as in 
the case of the orthogonal group, the defining Eq. (4.18) restricts the determinant, 
and in fact implies that |A| = ±1. In this case, however, the group has four 
components instead of two! Obviously those matrices with |A| = 1 must be 
disconnected from those with |A| = −1, but those which reverse the orientation 
of the space axes must also be disconnected from those which do not, and those 
which reverse the orientation of time must be disconnected from those which do 
not. This is represented schematically in Fig. 4.5. Note that, as in the case of O(3), 
multiplication by the transformations P and T takes us to and from the various 
different components. 


¹³ See Griffiths [10], Ch. 12.2. 
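
The four components can be told apart by two signs, that of |A| and that of A_44. The following sketch (with hypothetical helper names, and with time as the fourth coordinate as in the text) labels a few elements of O(3,1) this way and shows how multiplying a boost by P, T, or PT moves it between components.

```python
def matmul(a, b):
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def det(m):
    """Determinant by Laplace expansion along the first row."""
    if len(m) == 1:
        return m[0][0]
    return sum((-1) ** j * m[0][j] * det([row[:j] + row[j + 1:] for row in m[1:]])
               for j in range(len(m)))

def component(a):
    """Return (sign of |A|, sign of A_44), which labels a component of O(3,1)."""
    sign = lambda x: 1 if x > 0 else -1
    return (sign(det(a)), sign(a[3][3]))

P = [[-1, 0, 0, 0], [0, -1, 0, 0], [0, 0, -1, 0], [0, 0, 0, 1]]   # parity
T = [[1, 0, 0, 0], [0, 1, 0, 0], [0, 0, 1, 0], [0, 0, 0, -1]]     # time reversal
PT = matmul(P, T)

# Boost along z as in (4.29) with beta = 0.6, so gamma = 1.25 and beta*gamma = 0.75.
Lz = [[1, 0, 0, 0], [0, 1, 0, 0], [0, 0, 1.25, -0.75], [0, 0, -0.75, 1.25]]

labels = {
    "L": component(Lz),
    "PL": component(matmul(P, Lz)),
    "TL": component(matmul(T, Lz)),
    "PTL": component(matmul(PT, Lz)),
}
print(labels)
```

Each of the four products lands in a different component, and only the first pair of signs, (+1, +1), corresponds to SO(3,1)_o.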


[Figure 4.5: four panels making up O(3,1), labeled “|A| = 1, A44 > 0” (upper left), 
“|A| = −1, A44 > 0” (upper right), “|A| = −1, A44 < 0” (lower left), and “|A| = 1, A44 < 0” 
(lower right); arrows labeled P connect the left and right panels, and arrows labeled T 
connect the upper and lower panels.] 

Fig. 4.5 The four components of O(3,1). The proper Lorentz transformations in the upper-left 
corner are just SO(3,1)_o. Note that the transformations in the lower-right hand corner change 
both the orientation of the space axes and time, and so must be disconnected from SO(3,1)_o even 
though they have |A| = 1. Also note that multiplying by the parity and time-reversal operators P 
and T takes one back and forth between the various components 


Example 4.19. SL(2, C) Special linear group in two complex dimensions 


This group cannot be viewed as a group of isometries, but it is important in physics 
nonetheless. SL(2, C) is defined to be the set of all 2 x 2 complex matrices A with 
|A| = 1. By now it should be apparent that this set is a group. The general form of 
A ∈ SL(2, C) is 

\[ A = \begin{pmatrix} a & b \\ c & d \end{pmatrix}, \qquad a, b, c, d \in \mathbb{C}, \quad ad - bc = 1. \]

The unit determinant constraint means that A is determined by three complex 
parameters or six real parameters, just as for SO(3,1)_o. This is no coincidence; in 
fact, SL(2, C) bears the same relationship to SO(3,1)_o as SU(2) bears to SO(3), in 
that SL(2, C) implements restricted Lorentz transformations on spin 1/2 particles! 
This will be discussed in the next section. You will also show later¹⁴ that a boost 
with rapidity u is implemented by an SL(2, C) matrix of the form 

\[ L = \begin{pmatrix}
\cosh\frac{u}{2} + \frac{u_z}{u}\sinh\frac{u}{2} & -\frac{1}{u}(u_x - i u_y)\sinh\frac{u}{2} \\
-\frac{1}{u}(u_x + i u_y)\sinh\frac{u}{2} & \cosh\frac{u}{2} - \frac{u_z}{u}\sinh\frac{u}{2}
\end{pmatrix} \]

and it can be shown,¹⁵ just as for SO(3,1)_o, that any A ∈ SL(2, C) can be 
decomposed as A = LR, where R ∈ SU(2) and L is as above. This, together with 
the fact that an arbitrary rotation can be implemented by R ∈ SU(2) parametrized 
as in (4.26), yields the general form LR for an element of SL(2, C) in terms of the 
same parameters we used for SO(3,1)_o. □ 


Now that we have discussed several groups of physical relevance, it’s a good 
time to try and organize them somehow. If you look back over this section, you’ll 
see that each group implements either rotations or Lorentz transformations, but in 
different “flavors”. For instance, SO(3) are just the proper rotations, whereas O(3) 
includes the improper rotations, and SU(2) implements proper rotations but only 
for quantum-mechanical spin 1/2 particles. Analogous remarks, but for Lorentz 
transformations, apply to SO(3,1)_o, O(3,1), and SL(2, C). Furthermore, since 
proper rotations are a subset of all rotations, and since rotations are just a restricted 
class of Lorentz transformations, there are various inclusion relations amongst these 
groups. These interrelationships are all summarized in Table 4.1. We will add one 
more set of relationships to this table when we study homomorphisms in the next 
section. 


¹⁴ See Problem 4-8. 
¹⁵ See Problem 4-6. 


Table 4.1 The interrelationships between the groups discussed in this section 

Rotations        Lorentz transformations    Description 
SU(2)       ⊂    SL(2, C)                   Quantum-mechanical 
SO(3)       ⊂    SO(3,1)_o                  Proper 
  ∩                ∩ 
O(3)        ⊂    O(3,1)                     Improper 


4.4 Homomorphism and Isomorphism 


In the last section we claimed that there is a close relationship between SU(2) and 
SO(3), as well as between SL(2, C) and SO(3, 1)o. We now make this relationship 
precise, and show that a similar relationship exists between S_n and Z2. We will also 
define what it means for two groups to be “the same”, which will then tie into our 
somewhat abstract discussion of Z2 in the last section. 

Given two groups G and H, a homomorphism from G to H is a map 
Φ : G → H such that 

\[ \Phi(g_1 g_2) = \Phi(g_1)\,\Phi(g_2) \quad \forall\, g_1, g_2 \in G. \tag{4.37} \]

Note that the product in the left-hand side of (4.37) takes place in G, whereas 
the product on the right-hand side takes place in H. A homomorphism should be 
thought of as a map from one group to another which preserves the multiplicative 
structure. Note that Φ need not be one-to-one or onto; if it is onto, then Φ is said to 
be a homomorphism onto H, and if in addition it is one-to-one, then we say Φ is 
an isomorphism. If Φ is an isomorphism, then it is invertible and thus sets up a 
one-to-one correspondence which preserves the group structure, so we can then regard 
G and H as “the same” group, just with different labels for the elements. When two 
groups G and H are isomorphic we write G ≃ H. 


Exercise 4.14. Let Φ : G → H be a homomorphism, and let e be the identity in G and e′ 
the identity in H. Show that 

\[ \Phi(e) = e' \]
\[ \Phi(g^{-1}) = \Phi(g)^{-1} \quad \forall\, g \in G. \]


Example 4.20. Isometries and the orthogonal, unitary, and Lorentz groups 


A nice example of a group isomorphism is when we have an n-dimensional vector 
space V (over some scalars C) and a basis B, hence a map 

\[ GL(V) \to GL(n, C), \qquad T \mapsto [T]_B. \]

It’s easily checked that this map is one-to-one and onto. Furthermore, it is a 
homomorphism since [TU] = [T][U], a fact you proved in Exercise 2.10. Thus 
this map is an isomorphism and GL(V) ≃ GL(n, C). If V has a non-degenerate 
Hermitian form (·|·), then we can restrict this map to Isom(V) ⊂ GL(V), which 
yields 

\[ \mathrm{Isom}(V) \simeq O(n) \]

when V is real and (·|·) is positive-definite, 

\[ \mathrm{Isom}(V) \simeq U(n) \]

when V is complex and (·|·) is positive-definite, and 

\[ \mathrm{Isom}(V) \simeq O(n-1, 1) \]

when V is real and (·|·) is a Minkowski metric. These isomorphisms were implicit 
in the discussion of Examples 4.6–4.9, where we identified the operators in Isom(V) 
with their matrix representations in the corresponding matrix group. □ 


Example 4.21. Linear maps as homomorphisms 


A linear map from a vector space V to a vector space W is a map Φ : V → W that 
satisfies the usual linearity condition 

\[ \Phi(c v_1 + v_2) = c\,\Phi(v_1) + \Phi(v_2). \tag{4.38} \]

(A linear operator is then just the special case in which V = W.) In particular we 
have Φ(v1 + v2) = Φ(v1) + Φ(v2), which just says that Φ is a homomorphism 
between the additive groups V and W! (cf. Example 4.5). If Φ is one-to-one and 
onto, then it is an isomorphism, and in particular we refer to it as a vector space 
isomorphism. 

The notions of linear map and vector space isomorphism are basic ones, and 
could have been introduced much earlier (as they are in standard linear algebra 
texts), but because of our specific goals in this book we haven’t needed them yet. 
These objects will start to play a role soon, though, and will recur throughout the 
rest of the book. 


Exercise 4.15. Use an argument similar to that of Exercise 2.8 to prove that a linear map 
φ : V → W is an isomorphism if and only if dim V = dim W and φ satisfies 

\[ \phi(v) = 0 \implies v = 0. \]


Exercise 4.16. Before moving on to more complicated examples, let’s get some practice 
by acquainting ourselves with a few more basic homomorphisms. 

(a) First, show that the map 

\[ \exp : \mathbb{R} \to \mathbb{R}^*, \qquad x \mapsto e^x, \]

from the additive group of real numbers to the multiplicative group of nonzero real numbers, 
is a homomorphism. Is it an isomorphism? Why or why not? 

(b) Repeat the analysis from (a) for exp : ℂ → ℂ*. 

(c) Show that the map 

\[ \det : GL(n, C) \to C^*, \qquad A \mapsto \det A \]

is a homomorphism for both C = ℝ and C = ℂ. Is it an isomorphism in either case? Would 
you expect it to be? 


Exercise 4.17. Recall that U(1) is the group of 1 x 1 unitary matrices. Show that this is 
just the set of complex numbers z with |z| = 1, and that U(1) is isomorphic to SO(2). 


Suppose Φ is a homomorphism but not one-to-one. Is there a way to quantify 
how far it is from being one-to-one? Define the kernel of Φ to be the set 

\[ K \equiv \{ g \in G \mid \Phi(g) = e' \}, \]

where e′ is the identity in H. In words, K is the set of all elements of G that get 
sent to e′ under Φ; see Fig. 4.6. Note that e ∈ K by Exercise 4.14. Furthermore, if 
Φ is one-to-one, then K = {e}, since there can be only one element in G that maps 
to e′. If Φ is not one-to-one, then the size of K tells us how far it is from being so. 
Also, if we have Φ(g1) = Φ(g2) = h ∈ H, then 

\[ \Phi(g_1 g_2^{-1}) = \Phi(g_1)\,\Phi(g_2)^{-1} = h h^{-1} = e' \]

so g1 g2⁻¹ is in the kernel of Φ, i.e. g1 g2⁻¹ ≡ k ∈ K. Multiplying this on the right 
by g2 then gives g1 = k g2, so we see that any two elements of G that give the 
same element of H under Φ are related by left multiplication by an element of K. 
Conversely, if we are given g ∈ G and Φ(g) = h, then for all k ∈ K, 

\[ \Phi(kg) = \Phi(k)\,\Phi(g) = e'\,\Phi(g) = \Phi(g) = h. \]


Thus if we define¹⁶ Kg ≡ {kg | k ∈ K}, then Kg are precisely those elements (no 
more, and no less) of G which get sent to h. This is also depicted in Fig. 4.6. Thus, 
the size of K tells us how far Φ is from being one-to-one, and the elements of K tell 
us exactly which elements of G will map to a specific element h ∈ H. 


Fig. 4.6 Schematic depiction of the subsets K and Kg of G. They are mapped respectively to e′ 
and h in H by the homomorphism Φ 


¹⁶ We could also proceed by defining gK ≡ {gk | k ∈ K}, but this turns out to be the same as Kg. 
A subgroup K with this property, that Kg = gK ∀ g ∈ G, is called a normal subgroup. For more 
on this, see Herstein [12]. 

Homomorphisms and isomorphisms are ubiquitous in mathematics and occur 
frequently in physics, as we’ll see in the examples below. 
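
The statements about the kernel and the sets Kg can be verified by brute force for a small finite group. The sketch below uses the parity of permutations in S_3 as the homomorphism into Z2 = {+1, −1}; computing the parity by counting inversions is an assumption of this illustration (the text describes parity in terms of transpositions), and all function names are hypothetical.

```python
from itertools import permutations

def compose(s, t):
    """Composition (s t)(i) = s(t(i)) of permutations given as tuples of images."""
    return tuple(s[t[i]] for i in range(len(t)))

def sgn(s):
    """Parity of a permutation, computed by counting inversions."""
    n = len(s)
    inversions = sum(1 for i in range(n) for j in range(i + 1, n) if s[i] > s[j])
    return -1 if inversions % 2 else 1

G = list(permutations(range(3)))          # the six elements of S_3

# Homomorphism property (4.37): sgn(s t) = sgn(s) sgn(t) for all s, t in G.
is_homomorphism = all(sgn(compose(s, t)) == sgn(s) * sgn(t) for s in G for t in G)

# Kernel: all elements sent to the identity +1 of Z2.
K = [g for g in G if sgn(g) == 1]

def coset(g):
    """The set Kg = {kg | k in K}."""
    return sorted(compose(k, g) for k in K)

def fiber(h):
    """All elements of G that are sent to h."""
    return sorted(g for g in G if sgn(g) == h)

# For every g, the elements sent to sgn(g) are exactly those in Kg.
fibers_are_cosets = all(coset(g) == fiber(sgn(g)) for g in G)

print(is_homomorphism, len(K), fibers_are_cosets)
```

Here the kernel has three elements, so the map is three-to-one, and each of the two elements of Z2 is hit by exactly one set of the form Kg.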


Exercise 4.18. Show that the kernel K of any homomorphism Φ : G → H is a subgroup 
of G. Then determine the kernels of the maps exp and det of Exercise 4.16. 


Exercise 4.19. Suppose Φ : V → W is a linear map between vector spaces, hence 
a homomorphism between abelian groups. Conclude from the previous exercise that the 
kernel K of Φ is a subspace of V, also known as the null space of Φ. The dimension of 
K is known as the nullity of Φ. Also show that the range of Φ is a subspace of W, whose 
dimension is known as the rank of Φ. Finally, prove the rank-nullity theorem of linear 
algebra, which states that 

\[ \mathrm{rank}(\Phi) + \mathrm{nullity}(\Phi) = \dim V. \tag{4.39} \]

(Hint: Take a basis {e_1, ..., e_k} for K and complete it to a basis {e_1, ..., e_n} for V, where 
n = dim V. Then show that {Φ(e_{k+1}), ..., Φ(e_n)} is a basis for the range of Φ.) 


Example 4.22. SU(2) and SO(3) 


In most physics textbooks the relationship between SO(3) and SU(2) is described in 
terms of the “infinitesimal generators” of these groups. We will discuss infinitesimal 
transformations in the next section and make contact with the standard physics 
presentation then; here, we present the relationship in terms of a group homomorphism 
ρ : SU(2) → SO(3), defined as follows: consider the vector space (check!) of 
all 2 x 2 traceless anti-Hermitian matrices, denoted as su(2) (for reasons we will 
explain later). You can check that an arbitrary element X ∈ su(2) can be written as 

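\[ X = \frac{1}{2}\begin{pmatrix} -iz & -y - ix \\ y - ix & iz \end{pmatrix}, \qquad x, y, z \in \mathbb{R}. \tag{4.40} \]

If we take as basis vectors 

\[ S_x \equiv -\frac{i}{2}\sigma_x = \frac{1}{2}\begin{pmatrix} 0 & -i \\ -i & 0 \end{pmatrix}, \quad
S_y \equiv -\frac{i}{2}\sigma_y = \frac{1}{2}\begin{pmatrix} 0 & -1 \\ 1 & 0 \end{pmatrix}, \quad
S_z \equiv -\frac{i}{2}\sigma_z = \frac{1}{2}\begin{pmatrix} -i & 0 \\ 0 & i \end{pmatrix} \tag{4.41} \]

then we have 

\[ X = x S_x + y S_y + z S_z \]

so the column vector corresponding to X in the basis B = {S_x, S_y, S_z} is 

\[ [X] = \begin{pmatrix} x \\ y \\ z \end{pmatrix}. \]

Note that 

\[ \det X = \frac{1}{4}\left(x^2 + y^2 + z^2\right) = \frac{1}{4}\,\|[X]\|^2 \]

so the determinant of X ∈ su(2) is proportional to the norm squared of [X] ∈ ℝ³ 
with the usual Euclidean metric. Now, you will check below that A ∈ SU(2) acts 
on X ∈ su(2) by the map X ↦ AXA†, and that this map is linear. Thus, this map is 
a linear operator on su(2), and can be represented in the basis B by a 3 x 3 matrix 
which we’ll call ρ(A), so that 

\[ [A X A^\dagger] = \rho(A)\,[X] \]

where ρ(A) acts on [X] by the usual matrix multiplication. Furthermore, 

\[ \|\rho(A)[X]\|^2 = \|[A X A^\dagger]\|^2 = 4\det(A X A^\dagger) = 4\det X = \|[X]\|^2 \tag{4.42} \]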


so ρ(A) preserves the norm of [X]. This implies (see Exercise 4.21 below) that ρ(A) ∈ 
O(3), and one can in fact show¹⁷ that det ρ(A) = 1, so that ρ(A) ∈ SO(3). In fact, 
if A is as in (4.26), then ρ(A) is just (4.23) (Problem 4-7). Thus we may construct a 
map 

\[ \rho : SU(2) \to SO(3), \qquad A \mapsto \rho(A). \]

Furthermore, ρ is a homomorphism, since 

\[ \rho(AB)[X] = [(AB)X(AB)^\dagger] = [A B X B^\dagger A^\dagger] = \rho(A)[B X B^\dagger] = \rho(A)\rho(B)[X] \tag{4.43} \]

and hence ρ(AB) = ρ(A)ρ(B). Is ρ an isomorphism? One can show¹⁸ that ρ is 
onto but not one-to-one, and in fact has kernel K = {I, −I}. From the discussion 
preceding this example, we then know that ρ(A) = ρ(−A) ∀ A ∈ SU(2) (this 
fact is also clear from the definition of ρ), so for every rotation R ∈ SO(3) there 
correspond exactly two matrices in SU(2) which map to R under ρ. Thus, when 
trying to implement a rotation R on a spin 1/2 particle we have two choices for 
the SU(2) matrix we use, and it is sometimes said that the map ρ⁻¹ from SO(3) 
to SU(2) is double-valued. In mathematical terms one doesn’t usually speak of 
functions with multiple values, though, so instead we say that SU(2) is the 
double-cover of SO(3), since the map ρ is onto (“cover”) and two-to-one (“double”). 


¹⁷ See Problem 4-7 of this chapter, or consider the following rough (but correct) argument: 
ρ : SU(2) → O(3) as defined above is a continuous map, and so the composition 

\[ \det \circ \rho : SU(2) \to \mathbb{R}, \qquad A \mapsto \det(\rho(A)) \]

is also continuous. Since SU(2) is connected any continuous function must itself vary 
continuously, so det ∘ ρ can’t jump between 1 and −1, which are its only possible values. Since 
det(ρ(I)) = 1, we can then conclude that det(ρ(A)) = 1 ∀ A ∈ SU(2). 

¹⁸ See Problem 4-7 again. 


Box 4.3 Interpreting the Map ρ 

Note that the S_i of Eq. (4.41) are, up to a factor of i, just the spin 1/2 angular 
momentum matrices. Then, noting that A† = A⁻¹, the map X ↦ AXA† = 
AXA⁻¹ may remind you of a rotation of spin operators, which is just what it 
is. This connection will be explained when we discuss the Ad homomorphism 
in Example 4.43. 


Exercise 4.20. Let A ∈ SU(2), X ∈ su(2). Show that AXA† ∈ su(2) and note that 
A(X + Y)A† = AXA† + AYA†, so that the map X ↦ AXA† really is a linear operator 
on su(2). 


Exercise 4.21. Let V be a real vector space with a metric g and let R ∈ L(V) preserve 
norms on V, i.e. g(Rv, Rv) = g(v, v) ∀ v ∈ V. Show that this implies that 

\[ g(Rv, Rw) = g(v, w) \quad \forall\, v, w \in V, \]

i.e. that R is an isometry. Hint: consider g(v + w, v + w) and use the bilinearity of g. 
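
The construction of ρ in Example 4.22 can also be carried out numerically. The sketch below assumes the basis (4.41) and reads the components (x, y, z) of an element of su(2) off the entries of (4.40); the helper names (su2, coords, rho) are hypothetical. It then checks, for sample matrices, that ρ(A) is orthogonal with unit determinant, that ρ(AB) = ρ(A)ρ(B), and that ρ(−A) = ρ(A). A numerical check of this kind is of course not a proof.

```python
import math

def matmul(a, b):
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def dagger(a):
    n = len(a)
    return [[a[j][i].conjugate() for j in range(n)] for i in range(n)]

def transpose(a):
    return [list(row) for row in zip(*a)]

def close(a, b, tol=1e-12):
    return all(abs(x - y) < tol for ra, rb in zip(a, b) for x, y in zip(ra, rb))

def det3(m):
    return (m[0][0] * (m[1][1] * m[2][2] - m[1][2] * m[2][1])
            - m[0][1] * (m[1][0] * m[2][2] - m[1][2] * m[2][0])
            + m[0][2] * (m[1][0] * m[2][1] - m[1][1] * m[2][0]))

# Basis (4.41): S_i = -(i/2) sigma_i.
S = [
    [[0j, -0.5j], [-0.5j, 0j]],
    [[0j, -0.5 + 0j], [0.5 + 0j, 0j]],
    [[-0.5j, 0j], [0j, 0.5j]],
]

def coords(X):
    """Components (x, y, z) of X = x S_x + y S_y + z S_z, read off from (4.40)."""
    return [-2 * X[1][0].imag, 2 * X[1][0].real, 2 * X[1][1].imag]

def rho(A):
    """3x3 matrix of X -> A X A^dagger in the basis S; column j is the image of S_j."""
    Ad = dagger(A)
    images = [coords(matmul(matmul(A, Sj), Ad)) for Sj in S]
    return [[images[j][i] for j in range(3)] for i in range(3)]

def su2(a, b):
    """Normalize (a, b) to |a|^2 + |b|^2 = 1 and return [[a, b], [-conj b, conj a]]."""
    n = math.sqrt(abs(a) ** 2 + abs(b) ** 2)
    a, b = a / n, b / n
    return [[a, b], [-b.conjugate(), a.conjugate()]]

A = su2(1 + 2j, 3 - 1j)
B = su2(0.5j, 2 + 1j)
minus_A = [[-entry for entry in row] for row in A]
I3 = [[1, 0, 0], [0, 1, 0], [0, 0, 1]]

RA, RB = rho(A), rho(B)
is_orthogonal = close(matmul(transpose(RA), RA), I3)
has_unit_det = abs(det3(RA) - 1) < 1e-12
is_homomorphism = close(rho(matmul(A, B)), matmul(RA, RB))
is_two_to_one = close(rho(minus_A), RA)
print(is_orthogonal, has_unit_det, is_homomorphism, is_two_to_one)
```

The last check reflects the fact that A and −A always give the same rotation, which is the numerical face of the double-cover statement.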
Example 4.23. SL(2, C) and SO(3,1)_o 


Just as there is a two-to-one homomorphism from SU(2) to SO(3), there is a 
two-to-one homomorphism from SL(2, C) to SO(3,1)_o which is defined similarly. 
Consider the vector space H_2(C) of 2 x 2 Hermitian matrices. As we 
saw in Example 2.10, this is a four-dimensional vector space with basis B = 
{σx, σy, σz, I}, and an arbitrary element X ∈ H_2(C) can be written as 

\[ X = \begin{pmatrix} t + z & x - iy \\ x + iy & t - z \end{pmatrix} = x\sigma_x + y\sigma_y + z\sigma_z + tI \tag{4.44} \]

so that 

\[ [X] = \begin{pmatrix} x \\ y \\ z \\ t \end{pmatrix}. \]

Now, SL(2, C) acts on H_2(C) in the same way that SU(2) acts on su(2): by sending 
X ↦ AXA† where A ∈ SL(2, C). You can again check that this is actually a linear 
map from H_2(C) to itself, and hence can be represented in components by a matrix 
which we’ll again call ρ(A). You can also check that 

\[ \det X = t^2 - x^2 - y^2 - z^2 = -\eta([X], [X]) \]

so the determinant of X ∈ H_2(C) gives minus the norm squared of [X] in the 
Minkowski metric on ℝ⁴. As before, the action of ρ(A) on [X] preserves this 
norm [by a calculation identical to (4.42)], and you will show in Problem 4-8 that 
det ρ(A) = 1 and ρ(A)_44 ≥ 1, so ρ(A) ∈ SO(3,1)_o. Thus we can again construct a 
map 

\[ \rho : SL(2, C) \to SO(3,1)_o, \qquad A \mapsto \rho(A) \]

and it can be shown¹⁹ that ρ is onto. Furthermore, ρ is a homomorphism, by a 
calculation identical to (4.43). The kernel of ρ is again K = {I, −I} so SL(2, C) is 
a double-cover of SO(3,1)_o. 
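
The same kind of numerical check works for the Lorentz case. The sketch below assumes the basis {σx, σy, σz, I} of (4.44) and the metric η = diag(1, 1, 1, −1); the helper names are hypothetical, and the two sample matrices were chosen only because they have unit determinant and are not unitary. It verifies that ρ(A) preserves η, has unit determinant and ρ(A)_44 ≥ 1, and that ρ(AB) = ρ(A)ρ(B) and ρ(−A) = ρ(A).

```python
ETA = [[1, 0, 0, 0], [0, 1, 0, 0], [0, 0, 1, 0], [0, 0, 0, -1]]

def matmul(a, b):
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def dagger(a):
    n = len(a)
    return [[a[j][i].conjugate() for j in range(n)] for i in range(n)]

def transpose(a):
    return [list(row) for row in zip(*a)]

def close(a, b, tol=1e-9):
    return all(abs(x - y) < tol for ra, rb in zip(a, b) for x, y in zip(ra, rb))

def det(m):
    """Determinant by Laplace expansion along the first row."""
    if len(m) == 1:
        return m[0][0]
    return sum((-1) ** j * m[0][j] * det([row[:j] + row[j + 1:] for row in m[1:]])
               for j in range(len(m)))

# Basis of 2x2 Hermitian matrices used in (4.44): sigma_x, sigma_y, sigma_z, I.
BASIS = [
    [[0j, 1 + 0j], [1 + 0j, 0j]],
    [[0j, -1j], [1j, 0j]],
    [[1 + 0j, 0j], [0j, -1 + 0j]],
    [[1 + 0j, 0j], [0j, 1 + 0j]],
]

def coords(X):
    """Components (x, y, z, t) of a Hermitian matrix X written as in (4.44)."""
    return [X[1][0].real, X[1][0].imag,
            (X[0][0] - X[1][1]).real / 2, (X[0][0] + X[1][1]).real / 2]

def rho(A):
    """4x4 matrix of X -> A X A^dagger; column j is the image of the j-th basis vector."""
    Ad = dagger(A)
    images = [coords(matmul(matmul(A, Bj), Ad)) for Bj in BASIS]
    return [[images[j][i] for j in range(4)] for i in range(4)]

# Two sample elements of SL(2, C): both have determinant 1 and neither is unitary.
A = [[1 + 0j, 1 + 2j], [0j, 1 + 0j]]
B = [[2 + 0j, 1j], [1j, 0j]]
minus_A = [[-entry for entry in row] for row in A]

RA, RB = rho(A), rho(B)
preserves_eta = close(matmul(transpose(RA), matmul(ETA, RA)), ETA)
in_restricted_group = preserves_eta and abs(det(RA) - 1) < 1e-9 and RA[3][3] >= 1
is_homomorphism = close(rho(matmul(A, B)), matmul(RA, RB))
is_two_to_one = close(rho(minus_A), RA)
print(in_restricted_group, is_homomorphism, is_two_to_one)
```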


Exercise 4.22. The elements of SL(2, C) that correspond to rotations should fix timelike 
vectors, i.e. leave them unchanged. Identify the elements of H»(C) that correspond to 
timelike vectors, and show that it is precisely the subgroup SU(2) C SL(2,C) which 
leaves them unchanged, as expected. 


¹⁹ See Problem 4-8 again. 


Now that we have met the homomorphisms ρ : SU(2) → SO(3) and ρ : 
SL(2, C) → SO(3,1)_o, we can complete Table 4.1 by including those maps. This 
yields Table 4.2 below. 


Table 4.2 As in Table 4.1, but with the homomorphisms ρ : SU(2) → SO(3) and 
ρ : SL(2, C) → SO(3,1)_o 

Rotations        Lorentz transformations    Description 
SU(2)       ⊂    SL(2, C)                   Quantum-mechanical 
  ↓ ρ              ↓ ρ 
SO(3)       ⊂    SO(3,1)_o                  Proper 
  ∩                ∩ 
O(3)        ⊂    O(3,1)                     Improper 

This table summarizes the interrelationships between the matrix Lie 
groups introduced so far 


Example 4.24. Z2, parity, and time-reversal 


Consider the set {I, P} ⊂ O(3,1). This is an abelian group of two elements with 
P² = I, and so looks just like Z2. In fact, if we define a map Φ : {I, P} → Z2 by 

\[ \Phi(I) = 1 \]
\[ \Phi(P) = -1 \]

then Φ is a homomorphism since 

\[ \Phi(P \cdot P) = \Phi(I) = 1 = (-1)^2 = \Phi(P)\,\Phi(P). \]

Φ is also clearly one-to-one and onto, so Φ is in fact an isomorphism! We could also 
consider the two-element group {I, T} ⊂ O(3,1); since T² = I, we could define a 
similar isomorphism from {I, T} to Z2. Thus, 

\[ Z_2 \simeq \{I, P\} \simeq \{I, T\}. \]

In fact, you will show below that all two element groups are isomorphic. That is 
why we chose to present Z2 somewhat abstractly; there are many groups in physics 
that are isomorphic to Z2, so it makes sense to use an abstract formulation so that 
we can talk about all of them at the same time without having to think in terms of a 
particular representation like {I, P} or {I, T}. We will see the advantage of this in 
the next example. 


Exercise 4.23. Show that any group G with only two elements e and g must be isomorphic 
to Z2. To do this you must define a map Φ : G → Z2 which is one-to-one, onto, and which 
satisfies (4.37). Note that S_2, the symmetric group on two letters, has only two elements. 
What is the element in S_2 that corresponds to −1 ∈ Z2? 


Example 4.25. S_n, Z2, and the sgn homomorphism 


In Sect. 3.8 we discussed rearrangements, transpositions, and the evenness or 
oddness of a rearrangement in terms of the number of transpositions needed to 
obtain it. We are now in a position to make this much more precise, which will 
facilitate neater descriptions of the ε tensor, the determinant of a matrix, and the 
symmetrization postulate. 

We formally define a transposition in S_n to be any permutation τ which 
switches two numbers i and j and leaves all the others alone. (We’ll usually denote 
transpositions as τ and more general permutations as σ.) You can check that σ2 and 
σ1 · σ2 from Example 4.12 are both transpositions. It is a fact that any permutation 
can be written (non-uniquely) as a product of transpositions. Though we won’t 
prove this here,²⁰ the following argument should make this fact plausible: a 
permutation σ corresponds to a rearrangement {σ(1), σ(2), ..., σ(n)} of the numbers 
{1, 2, ..., n}. A transposition corresponds to switching any two numbers in the 
ordered list {1, 2, ..., n}. Given a particular rearrangement {σ(1), σ(2), ..., σ(n)}, 
we can build it by starting from {1, 2, ..., n}, repeatedly switching the “1” with its 
neighbor to the right until it gets to the right place, then switching the “2” with its 
neighbor to the right until it gets to the right place, and repeating until we arrive 
at {σ(1), σ(2), ..., σ(n)}. In this way {σ(1), σ(2), ..., σ(n)} can be written as a 
product of transpositions. As an example, you can check that 

\[ \sigma_1 = \begin{pmatrix} 1 & 2 & 3 \\ 2 & 3 & 1 \end{pmatrix} = \begin{pmatrix} 1 & 2 & 3 \\ 3 & 2 & 1 \end{pmatrix} \begin{pmatrix} 1 & 2 & 3 \\ 2 & 1 & 3 \end{pmatrix} \]

is a decomposition of σ1 into transpositions. 
Though the decomposition of a given permutation is far from unique (for 
instance, the identity can be decomposed as a product of any transposition τ and its 
inverse), the evenness or oddness of the number of transpositions in a decomposition 
is invariant. For instance, even though we could write the identity as 

\[ e = e \]
\[ e = \tau_1 \tau_1^{-1} \]
\[ e = \tau_1 \tau_1^{-1} \tau_2 \tau_2^{-1} \]

and so on for any transpositions τ1, τ2, ..., every decomposition will consist of an 
even number of transpositions. Likewise, any decomposition of a single transposition, 
such as either of the two factors in the decomposition of σ1 above, will consist of an 
odd number of transpositions. A general proof of this 
fact is relegated to Problem 4-9. What this allows us to do, though, is define a 
homomorphism sgn : S_n → Z2 by 

\[ \mathrm{sgn}(\sigma) = \begin{cases} +1 & \text{if } \sigma \text{ consists of an even number of transpositions} \\ -1 & \text{if } \sigma \text{ consists of an odd number of transpositions.} \end{cases} \]


²⁰ See Herstein [12] for a proof and nice discussion. 

You should be able to verify with a moment’s thought that 

\[ \mathrm{sgn}(\sigma_1 \sigma_2) = \mathrm{sgn}(\sigma_1)\,\mathrm{sgn}(\sigma_2), \]

so that sgn is actually a homomorphism. If sgn(σ) = +1 we say σ is even, and if 
sgn(σ) = −1 we say σ is odd. 

With the sgn homomorphism in hand we can now tidy up several definitions 
from Sect. 3.8. First of all, we can now define the wedge product of r dual vectors 
f^i, i = 1, ..., r to be 

\[ f^1 \wedge \cdots \wedge f^r \equiv \sum_{\sigma \in S_r} \mathrm{sgn}(\sigma)\, f^{\sigma(1)} \otimes f^{\sigma(2)} \otimes \cdots \otimes f^{\sigma(r)}. \]

You should compare this with the earlier definition and convince yourself that the 
two definitions are equivalent. Also, from our earlier definition of the ε tensor it 
should be clear that the nonzero components of ε are 

\[ \epsilon_{i_1 \ldots i_n} = \mathrm{sgn}(\sigma) \quad \text{where} \quad \sigma = \begin{pmatrix} 1 & \cdots & n \\ i_1 & \cdots & i_n \end{pmatrix} \]

and so the definition of the determinant, (3.73), becomes 

\[ |A| = \sum_{\sigma \in S_n} \mathrm{sgn}(\sigma)\, A_{1\sigma(1)} \cdots A_{n\sigma(n)}. \]
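
The definitions above translate directly into a short computation. In the sketch below, sgn is obtained by sorting a rearrangement back to the identity with swaps and counting the transpositions used; this is a variant of the neighbor-switching argument in the text, not the same procedure, and the function names are hypothetical. Permutations are written 0-based, so σ1 = (1 2 3 → 2 3 1) appears as the tuple (1, 2, 0). The determinant is then evaluated from the permutation sum.

```python
from itertools import permutations
from math import prod

def transposition_count(perm):
    """Sort the rearrangement back to the identity by swaps, counting the swaps used."""
    p = list(perm)
    count = 0
    for i in range(len(p)):
        while p[i] != i:
            j = p[i]
            p[i], p[j] = p[j], p[i]   # one transposition puts the value j in slot j
            count += 1
    return count

def sgn(perm):
    """+1 for an even number of transpositions, -1 for an odd number."""
    return -1 if transposition_count(perm) % 2 else 1

def compose(s, t):
    """Composition (s t)(i) = s(t(i))."""
    return tuple(s[t[i]] for i in range(len(t)))

def leibniz_det(A):
    """|A| as the sum over permutations of sgn(s) A[0][s(0)] ... A[n-1][s(n-1)]."""
    n = len(A)
    return sum(sgn(s) * prod(A[i][s[i]] for i in range(n))
               for s in permutations(range(n)))

sigma_1 = (1, 2, 0)                       # the permutation sigma_1 of the text, 0-based
S4 = list(permutations(range(4)))
is_homomorphism = all(sgn(compose(s, t)) == sgn(s) * sgn(t) for s in S4 for t in S4)

print(transposition_count(sigma_1), sgn(sigma_1), is_homomorphism)
print(leibniz_det([[1, 2, 3], [4, 5, 6], [7, 8, 10]]))
```

For σ1 the count is two, matching the two-transposition decomposition given earlier, and the sample determinant agrees with the value −3 obtained by cofactor expansion.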


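Finally, at the end of Example 4.12 we described how S_n acts on an n-fold tensor 
product T^0_n(V) by 

\[ \sigma(v_1 \otimes v_2 \otimes \cdots \otimes v_n) = v_{\sigma(1)} \otimes v_{\sigma(2)} \otimes \cdots \otimes v_{\sigma(n)}, \tag{4.45} \]

and extending linearly. If we have a totally symmetric tensor 
T = T^{i_1...i_n} e_{i_1} ⊗ ... ⊗ e_{i_n} ∈ S^n(V), we then have 

\[ \begin{aligned}
\sigma(T) &= T^{i_1 \ldots i_n}\, e_{\sigma(i_1)} \otimes \cdots \otimes e_{\sigma(i_n)} \\
&= T^{\sigma^{-1}(j_1) \ldots \sigma^{-1}(j_n)}\, e_{j_1} \otimes \cdots \otimes e_{j_n} \quad \text{where we relabel indices using } j_k \equiv \sigma(i_k) \\
&= T^{j_1 \ldots j_n}\, e_{j_1} \otimes \cdots \otimes e_{j_n} \quad \text{by total symmetry of } T^{j_1 \ldots j_n} \\
&= T
\end{aligned} \]

so all elements of S^n(V) are fixed by the action of S_n. If we now consider a totally 
antisymmetric tensor T = T^{i_1...i_n} e_{i_1} ⊗ ... ⊗ e_{i_n} ∈ Λ^n(V), then the action of σ ∈ S_n 
on it is given by 

\[ \begin{aligned}
\sigma(T) &= T^{i_1 \ldots i_n}\, e_{\sigma(i_1)} \otimes \cdots \otimes e_{\sigma(i_n)} \\
&= T^{\sigma^{-1}(j_1) \ldots \sigma^{-1}(j_n)}\, e_{j_1} \otimes \cdots \otimes e_{j_n} \\
&= \mathrm{sgn}(\sigma)\, T^{j_1 \ldots j_n}\, e_{j_1} \otimes \cdots \otimes e_{j_n} \quad \text{by antisymmetry of } T^{j_1 \ldots j_n} \\
&= \mathrm{sgn}(\sigma)\, T
\end{aligned} \]

so if σ is odd then T changes sign under it, and if σ is even then T is invariant. 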
Thus, we can restate the symmetrization postulate as follows: 


Symmetrization Postulate II: Any state of an n-particle system is either 
invariant under a permutation of the particles (in which case the particles are 
known as bosons), or changes sign depending on whether the permutation is 
even or odd (in which case the particles are known as fermions). 


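Switching now to Dirac notation, if we furthermore want to comply with the 
symmetrization postulate and construct a totally symmetric/anti-symmetric n-particle 
state |ψ⟩ out of n states |ψ1⟩, |ψ2⟩, ..., |ψn⟩, we can simply write 

\[ |\psi\rangle = \sum_{\sigma \in S_n} |\psi_{\sigma(1)}\rangle |\psi_{\sigma(2)}\rangle \cdots |\psi_{\sigma(n)}\rangle \in S^n(\mathcal{H}) \]

or 

\[ |\psi\rangle = \sum_{\sigma \in S_n} \mathrm{sgn}(\sigma)\, |\psi_{\sigma(1)}\rangle |\psi_{\sigma(2)}\rangle \cdots |\psi_{\sigma(n)}\rangle \in \Lambda^n(\mathcal{H}). \]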

4.5 From Lie Groups to Lie Algebras 


You may have noticed that the examples of groups we met in the last two sections 
had a couple of different flavors: there were the matrix groups like S U (2) and SO (3) 
which were parametrizable by a certain number of real parameters, and then there 
were the “discrete” groups like Z) and the symmetric groups S,, that had a finite 
number of elements and were described by discrete labels rather than continuous 
parameters. The first type of group forms an extremely important subclass of groups 
known as Lie Groups, named after the Norwegian mathematician Sophus Lie who 
was among the first to study them systematically in the late 1800s. Besides their 
ubiquity in math and physics, Lie groups are important because their continuous 
nature means that we can study group elements that are “infinitely close” to the 
identity; these are known to physicists as the “infinitesimal transformations” or 
“generators” of the group, and to mathematicians as the Lie algebra of the group. 
As we make this notion precise, we'll see that Lie algebras are vector spaces and 
as such are sometimes simpler to understand than the “finite” group elements. Also,
the “generators” in some Lie algebras are taken in quantum mechanics to represent 
certain physical observables, and in fact almost all observables can be built out 
of elements of certain Lie algebras. We will see that many familiar objects and 
structures in physics can be understood in terms of Lie algebras. 

Before we can study Lie algebras, however, we should make precise what we 
mean by a Lie group. Here we run into a snag, because the proper and most general 
definition requires machinery well outside the scope of this text.²¹ We do wish to be
precise, though, so we follow Hall [11] and use a restricted definition which doesn’t 
really capture the essence of what a Lie group is, but which will get the job done 
and allow us to discuss Lie algebras without having to wave our hands. 

That said, we define a matrix Lie group to be a subgroup $G \subset GL(n, \mathbb{C})$ which
is closed, in the following sense: for any sequence of matrices $A_n \in G$ which
converges to a limit matrix A, either $A \in G$ or $A \notin GL(n, \mathbb{C})$. All this says is that a
limit of matrices in G must either itself be in G, or otherwise be noninvertible. As
remarked above, this definition is technical and doesn’t provide much insight into 
what a Lie group really is, but it will provide the necessary hypotheses in proving 
the essential properties of Lie algebras. 

Let’s now prove that some of the groups we’ve encountered above are indeed 
matrix Lie groups. We’ll verify this explicitly for one class of groups, the orthogonal 
groups, and leave the rest as problems for you. The orthogonal group O(n) is 
defined by the equation $R^{-1} = R^T$, or $R^T R = I$. Let’s consider the function from
$GL(n, \mathbb{R})$ to itself defined by $f(A) = A^T A$. Each entry of the matrix f(A) is easily
seen to be a continuous function of the entries of A, so f is continuous. Consider
now a sequence $R_i$ in O(n) that converges to some limit matrix R. We then have


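\[
\begin{aligned}
f(R) &= f\left(\lim_{i \to \infty} R_i\right) \\
&= \lim_{i \to \infty} f(R_i) \qquad \text{since } f \text{ is continuous} \\
&= \lim_{i \to \infty} I \\
&= I
\end{aligned}
\]

so $R \in O(n)$. Thus O(n) is a matrix Lie group. The unitary and Lorentz groups,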
as well as their cousins with unit determinant, are similarly defined by continuous 
functions, and can analogously be shown to be matrix Lie groups. For an example 
of a subgroup of GL(n, C) which is not closed, hence not a matrix Lie group, see 
Problem 4-10. 


²¹ The necessary machinery being the theory of differentiable manifolds; in this context, a Lie group
is essentially a group that is also a differentiable manifold. See Schutz [18] or Frankel [6] for very 
readable introductions for physicists, and Warner [21] for a systematic but terse account. 
We remarked that the above definition doesn’t really capture the essence of what
a Lie group is. What is that essence? As mentioned before, one should think of Lie 
groups as groups which can be parametrized in terms of a certain number of real 
variables. This number is known as the dimension of the Lie group, and we will see 
that this number is also the usual (vector space) dimension of its corresponding Lie 
algebra. This parameterization is totally analogous to the way one parameterizes the 
surface of a sphere with polar and azimuthal angles in vector calculus in order to 
compute surface integrals. In fact, we already showed in Box 4.2 that one can (and 
should) think of a Lie group as a kind of multidimensional space (like the surface 
of a sphere) that also has a group structure. Unlike the surface of a sphere, though, 
a Lie group has a distinguished point, the identity e. Furthermore, as we mentioned 
above, studying transformations “close to” the identity will lead us to Lie algebras. 

For the sake of completeness, we should point out here that there are Lie groups 
out there which are not matrix Lie groups, i.e. which cannot be described as a subset 
of GL(n, C) for some n. Their relevance for basic physics has not been established, 
however, so we don’t consider them here.²²

Now that we have a better sense of what Lie groups are, we'd like to zoom in 
to Lie algebras by considering group elements that are “close” to the identity. For 
concreteness consider the rotation group SO(3). An arbitrary rotation about the z 
axis looks like 


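\[
R_z(\theta) = \begin{pmatrix} \cos\theta & -\sin\theta & 0 \\ \sin\theta & \cos\theta & 0 \\ 0 & 0 & 1 \end{pmatrix}
\]

and if we take our rotation angle to be $\epsilon \ll 1$, we can approximate $R_z(\epsilon)$ by
expanding to first order, which yields

\[
R_z(\epsilon) \approx R_z(0) + \epsilon \left.\frac{dR_z}{d\theta}\right|_{\theta=0} = I + \epsilon L_z \tag{4.46}
\]

where you may recall from Sect. 4.1 that

\[
L_z \equiv \left.\frac{dR_z}{d\theta}\right|_{\theta=0} = \begin{pmatrix} 0 & -1 & 0 \\ 1 & 0 & 0 \\ 0 & 0 & 0 \end{pmatrix}.
\]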


Now, we should be able to describe a finite rotation through an angle θ as an
n-fold iteration of smaller rotations through an angle θ/n. As we take n larger,
θ/n becomes smaller and the approximation (4.46) for the smaller rotation through
$\epsilon = \theta/n$ becomes better. Thus, we expect


²² See Hall [11] for further information and references.
\[
R_z(\theta) = \left[R_z(\theta/n)\right]^n \approx \left(I + \frac{\theta L_z}{n}\right)^n
\]

to become an equality in the limit $n \to \infty$. However, you should check, if it is not
already a familiar fact, that

\[
\lim_{n \to \infty} \left(1 + \frac{X}{n}\right)^n = \sum_{n=0}^{\infty} \frac{X^n}{n!} = e^X \tag{4.47}
\]

for any real number or matrix X. Thus, we can write

\[
R_z(\theta) = e^{\theta L_z} \tag{4.48}
\]


where from here on out the exponential of a matrix or linear operator is defined by 
the power series in (4.47). Notice that the set $\{R_z(\theta) = e^{\theta L_z} \mid \theta \in \mathbb{R}\}$ is a subgroup
of SO(3); this can be seen by explicitly checking that

\[
R_z(\theta_1) R_z(\theta_2) = R_z(\theta_1 + \theta_2), \tag{4.49}
\]

or using the property²³ of exponentials that $e^X e^Y = e^{X+Y}$, or recognizing
intuitively that any two successive rotations about the z-axis yield another rotation
about the z-axis. Notice that (4.49) says that if we consider $R_z(\theta)$ to be the image of
θ under a map

\[
R_z : \mathbb{R} \to SO(3), \tag{4.50}
\]

then $R_z$ is a homomorphism! Any such continuous homomorphism from the additive
group R to a matrix Lie group G is known as a one-parameter subgroup. One- 
parameter subgroups are actually very familiar to physicists; the set of rotations in 
$\mathbb{R}^3$ about any particular axis (not just the z-axis) is a one-parameter subgroup (where
the parameter can be taken to be the rotation angle), as is the set of all boosts in a 
particular direction (in which case the parameter can be taken to be the absolute 
value u of the rapidity). We’ll see that translations along a particular direction in 
both momentum and position space are one-parameter subgroups as well. 
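The limiting argument behind (4.48) and the homomorphism property (4.49) can be checked numerically. The following is a minimal sketch in plain Python, with matrices stored as lists of rows; the helper names (`matmul`, `max_diff`, `rot_z`, `iterated_rotation`) are illustrative choices, not notation from the text.

```python
import math

def matmul(A, B):
    """Product of two matrices stored as lists of rows."""
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*B)] for row in A]

def max_diff(A, B):
    """Largest entrywise difference between two matrices."""
    return max(abs(a - b) for ra, rb in zip(A, B) for a, b in zip(ra, rb))

def rot_z(theta):
    """The finite rotation R_z(theta) about the z-axis."""
    c, s = math.cos(theta), math.sin(theta)
    return [[c, -s, 0.0], [s, c, 0.0], [0.0, 0.0, 1.0]]

L_Z = [[0.0, -1.0, 0.0], [1.0, 0.0, 0.0], [0.0, 0.0, 0.0]]

def iterated_rotation(theta, n):
    """(I + theta L_z / n)^n: n first-order rotations applied in a row."""
    step = [[(1.0 if i == j else 0.0) + theta * L_Z[i][j] / n for j in range(3)]
            for i in range(3)]
    result = [[1.0 if i == j else 0.0 for j in range(3)] for i in range(3)]
    for _ in range(n):
        result = matmul(result, step)
    return result

theta = 1.0
errors = [max_diff(iterated_rotation(theta, n), rot_z(theta)) for n in (10, 100, 1000)]
print(errors)  # the error shrinks roughly like 1/n

# Homomorphism property (4.49): R_z(a) R_z(b) = R_z(a + b)
hom_error = max_diff(matmul(rot_z(0.3), rot_z(0.5)), rot_z(0.8))
print(hom_error)  # rounding error only
```

The slow, roughly 1/n convergence is why the power series in (4.47), rather than the limit itself, is the practical way to evaluate a matrix exponential.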
If we have a matrix X such that $e^{tX} \in G$ for all $t \in \mathbb{R}$, then the map

\[
\begin{aligned}
\exp : \mathbb{R} &\to G \\
t &\mapsto e^{tX}
\end{aligned} \tag{4.51}
\]

is a one-parameter subgroup, by the abovementioned property of exponentials.

Conversely, if we have a one-parameter subgroup $\gamma : \mathbb{R} \to G$, then we know that
$\gamma(0) = I$ (since γ is a homomorphism), and, defining


²³ This is actually only true when X and Y commute; more on this later.
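\[
X \equiv \frac{d\gamma}{dt}(0) \tag{4.52}
\]

we have

\[
\begin{aligned}
\frac{d\gamma}{dt}(t) &= \lim_{h \to 0} \frac{\gamma(t+h) - \gamma(t)}{h} \\
&= \lim_{h \to 0} \frac{\gamma(h)\gamma(t) - \gamma(t)}{h} \\
&= \left(\lim_{h \to 0} \frac{\gamma(h) - I}{h}\right) \gamma(t) \\
&= \left(\lim_{h \to 0} \frac{\gamma(h) - \gamma(0)}{h}\right) \gamma(t) \\
&= X\gamma(t).
\end{aligned}
\]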


You should recall²⁴ that the first-order linear matrix differential equation $\frac{d\gamma}{dt}(t) =
X\gamma(t)$, with initial condition $\gamma(0) = I$, has the unique solution $\gamma(t) = e^{tX}$. Thus, every one-parameter subgroup
is of the form (4.51), so we have a one-to-one correspondence between one-parameter
subgroups and matrices X such that $e^{tX} \in G$ for all $t \in \mathbb{R}$. The matrix X
is sometimes said to “generate” the corresponding one-parameter subgroup²⁵ [e.g.,
$L_z$ “generates” rotations about the z-axis, according to (4.48)], and to each X there
corresponds an “infinitesimal transformation” $I + \epsilon X$. Really, though, X is best
thought of as a derivative, as given by (4.52) and emphasized in Sect. 4.1.

Any X that generates a one-parameter subgroup also carries a geometric interpretation,
which we won’t make precise²⁶ but describe here as a useful heuristic.
If we think of G as a multidimensional space, and our one-parameter subgroup
$\gamma(t) = e^{tX}$ as a parameterized curve in that space, then (4.52) tells us that X is
a tangent vector to γ(t) at the identity. We can then interpret all such X this way,
and it turns out that these X actually compose the entire tangent space to G at e,
pictured in Fig. 4.7. We are thus led to a vector space of matrices corresponding
to one-parameter subgroups, and this is precisely the Lie algebra of the matrix Lie
group.


²⁴ See any standard text on linear algebra and linear ODEs.

²⁵ From (4.52) it’s hopefully also clear that any such X also “generates” the transformations
corresponding to the one-parameter subgroup, in the sense described in Sect. 4.1.

²⁶ See Frankel [6] for a detailed discussion.
Fig. 4.7 Schematic representation of a Lie group G, depicted here as a sphere,
along with its Lie algebra g, given by the plane tangent to the identity e. A
one-parameter subgroup $e^{tX}$ is shown running through e, and its tangent vector at e is X


4.6 Lie Algebras: Definition, Properties, and Examples 


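In accordance with the previous section, we define the Lie algebra $\mathfrak{g}$ of a given
matrix Lie group $G \subset GL(n, \mathbb{C})$ as:

\[
\mathfrak{g} = \{X \in M_n(\mathbb{C}) \mid e^{tX} \in G \ \ \forall\, t \in \mathbb{R}\}. \tag{4.53}
\]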


In this section we’ll establish some of the basic properties of the Lie algebras of the 
various isometry groups we’ve met, such as O(n), U(n), O(n − 1, 1), and SO(n). We’ll
find that these Lie algebras have certain properties in common, and we’ll prove that 
in fact all Lie algebras of matrix Lie groups have these properties. The discussion 
in this section will also set us up for a detailed look in Sect. 4.7 at the Lie algebras 
of our specific groups from Sect. 4.3. 

Before looking at isometries, let’s warm up by considering the more general 
group of invertible linear operators GL(n, C), and its Lie algebra gl(n,C). How 
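can we describe gl(n, C)? If X is any element of $M_n(\mathbb{C})$, then $e^{tX}$ is invertible for
all t, as its inverse is simply $e^{-tX}$. Thus

\[
\mathfrak{gl}(n, \mathbb{C}) = M_n(\mathbb{C}). \tag{4.54}
\]

Likewise,

\[
\mathfrak{gl}(n, \mathbb{R}) = M_n(\mathbb{R}).
\]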


Thus, the Lie algebra of the group of invertible linear operators is simply the space 
of all linear operators! 
Now, consider a finite-dimensional vector space V equipped with a
non-degenerate Hermitian form (·|·). How can we describe the Lie algebra
associated with the isometry group Isom(V)?²⁷ If X is in this Lie algebra,
then (4.53) and (4.10) tell us that

\[
(e^{tX} v \,|\, e^{tX} w) = (v|w) \qquad \forall\, v, w \in V,\ t \in \mathbb{R}.
\]

Differentiating with respect to t yields

\[
(X e^{tX} v \,|\, e^{tX} w) + (e^{tX} v \,|\, X e^{tX} w) = 0 \qquad \forall\, v, w \in V,\ t \in \mathbb{R}.
\]

Evaluating at t = 0 and rearranging gives

\[
(Xv|w) = -(v|Xw) \qquad \forall\, v, w \in V. \tag{4.55}
\]


This is the fundamental description of any Lie algebra of isometries, and will take 
various concrete forms as we proceed through the examples below. The minus sign, 
in particular, will play a key role. 


Example 4.26. o(n) The Lie algebra of O(n) 
Let $V = \mathbb{R}^n$ with the Euclidean dot product (·|·), and denote the Lie algebra of
O(n) by o(n). If we take an orthonormal basis and use X to denote both the matrix
X ∈ o(n) and the corresponding linear operator on $\mathbb{R}^n$, (4.55) becomes

\[
\begin{aligned}
(X[v])^T [w] &= -[v]^T X[w] \qquad \forall\, v, w \in \mathbb{R}^n \\
\iff\quad [v]^T X^T [w] &= -[v]^T X[w] \qquad \forall\, v, w \in \mathbb{R}^n \\
\iff\quad X^T &= -X
\end{aligned} \tag{4.56}
\]

where the last line follows from the same logic as in Exercise 4.4. Thus:

The Lie algebra of O(n) is the set of n × n antisymmetric matrices.

You can check that the matrices $A_{ij}$ from Example 2.9 form a basis for o(n), so
that $\dim \mathfrak{o}(n) = \frac{n(n-1)}{2}$. As we’ll see in detail in the n = 3 case, the $A_{ij}$ can be
interpreted as generating rotations in the i-j plane.


Example 4.27. u(n) The Lie algebra of U(n) 


Now let $V = \mathbb{C}^n$ with the Hermitian inner product (·|·), and denote the Lie
algebra of U(n) by u(n). For X ∈ u(n) consider its Hermitian adjoint $X^\dagger$, which


²⁷ We will freely use the identification between finite-dimensional linear operators and matrices
here, so that we may think of Isom(V) as a matrix Lie group and hence consider its Lie algebra. 
The exponential of linear operators is defined by (4.47), just as for matrices. 
satisfies (2.49). Again identifying matrices and linear operators via an orthonormal
basis, (4.55) yields

\[
\begin{aligned}
(v|X^\dagger w) &= -(v|Xw) \\
\implies\quad X^\dagger &= -X
\end{aligned} \tag{4.57}
\]

again by the same logic as in Exercise 4.4. In other words:

The Lie algebra of U(n) is the set of n × n anti-Hermitian matrices.


In Exercise 2.6 you constructed a basis for $H_n(\mathbb{C})$, and as we mentioned in
Example 2.4, multiplying a Hermitian matrix by i yields an anti-Hermitian matrix.
Thus, if we multiply the basis from Exercise 2.6 by i we get a basis for u(n), and it
then follows that $\dim \mathfrak{u}(n) = n^2$. Note that a real anti-symmetric matrix can also be
considered an anti-Hermitian matrix, so that $\mathfrak{o}(n) \subset \mathfrak{u}(n)$. □
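The conditions (4.56) and (4.57) can be sanity-checked by exponentiating sample matrices and testing whether the result is an isometry. The following is a minimal sketch in plain Python that sums the power series (4.47) directly; the helper names and the two sample matrices are assumptions made for illustration.

```python
def matmul(A, B):
    """Product of two matrices stored as lists of rows."""
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*B)] for row in A]

def expm(X, terms=40):
    """Matrix exponential from the truncated power series sum_k X^k / k!."""
    n = len(X)
    total = [[1.0 if i == j else 0.0 for j in range(n)] for i in range(n)]
    term = [row[:] for row in total]
    for k in range(1, terms):
        term = [[x / k for x in row] for row in matmul(term, X)]
        total = [[s + t for s, t in zip(srow, trow)] for srow, trow in zip(total, term)]
    return total

def dagger(A):
    """Conjugate transpose (reduces to the transpose for real matrices)."""
    return [[A[j][i].conjugate() for j in range(len(A))] for i in range(len(A[0]))]

def unitarity_defect(U):
    """Largest entry of |U^dagger U - I|; zero for an exact isometry."""
    P = matmul(dagger(U), U)
    return max(abs(P[i][j] - (1.0 if i == j else 0.0))
               for i in range(len(P)) for j in range(len(P)))

# A real antisymmetric matrix, i.e. an element of o(3) ...
X_real = [[0.0, -0.3, 0.5], [0.3, 0.0, -0.2], [-0.5, 0.2, 0.0]]
# ... and an anti-Hermitian matrix, i.e. an element of u(2).
X_cplx = [[0.4j, 1.0 + 0.5j], [-1.0 + 0.5j, -0.1j]]

print(unitarity_defect(expm(X_real)))  # rounding error only: exp(X) is orthogonal
print(unitarity_defect(expm(X_cplx)))  # rounding error only: exp(X) is unitary
```

Replacing either sample by a matrix that violates its condition (for instance, adding a real number to a diagonal entry) makes the defect visibly nonzero.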


Box 4.4 The Physicist’s Definition of a Lie Algebra 

Before meeting some other Lie algebras, we have a loose end to tie up. We 
claimed earlier that most physical observables are best thought of as elements 
of Lie algebras (we’ll justify this statement in Sect. 4.8). However, we know
that those observables must be represented by Hermitian operators (so that 
their eigenvalues are real), and yet we just saw that the elements of o(n) and 
u(n) are anti-Hermitian! This is where our mysterious factor of i, which we 
mentioned in the last section, comes into play. Strictly speaking, o(n) and u(n)
are real vector spaces (check!) whose elements are anti-Hermitian matrices.
However, if we permit ourselves to ignore the real vector space structure and
treat these matrices just as elements of $M_n(\mathbb{C})$, we can multiply them by i to get
Hermitian matrices, which we can then take to represent physical observables. 
In the physics literature, one usually turns this around and defines generators 
of transformations to be these Hermitian observables, and then multiplies by 
i to get an anti-Hermitian operator, which can then be exponentiated into an 
isometry. So one could say that the physicist’s definition of the Lie algebra of 
a matrix Lie group G is 


\[
\mathfrak{g}_{\text{physics}} = \{X \in M_n(\mathbb{C}) \mid e^{itX} \in G \ \ \forall\, t \in \mathbb{R}\}.
\]

In the rest of this book we will stick with our original definition of the Lie 


algebra of a matrix Lie group, but you should be aware that the physicist’s 
definition is often implicit in the physics literature. 
Example 4.28. o(n − 1, 1) The Lie algebra of O(n − 1, 1)

Now let $V = \mathbb{R}^n$ with the Minkowski metric η, and let o(n − 1, 1) denote the
Lie algebra of the Lorentz group O(n − 1, 1). For X ∈ o(n − 1, 1), and letting
[η] = Diag(1, 1, ..., 1, −1) (see (4.60) below), Eq. (4.55) becomes

\[
\begin{aligned}
[v]^T X^T [\eta][w] &= -[v]^T [\eta] X [w] \qquad \forall\, v, w \in \mathbb{R}^n \\
\iff\quad X^T [\eta] &= -[\eta] X.
\end{aligned} \tag{4.58}
\]

Writing X out in block form with an (n − 1) × (n − 1) matrix X′ for the spatial
components (i.e., the $X_{ij}$ where i, j < n), and vectors a and b for the components
$X_{in}$ and $X_{ni}$, i < n, this reads

\[
\begin{pmatrix} X'^T & -\mathbf{b} \\ \mathbf{a} & -X_{nn} \end{pmatrix} = -\begin{pmatrix} X' & \mathbf{a} \\ -\mathbf{b} & -X_{nn} \end{pmatrix}.
\]

This implies that X has the form

\[
X = \begin{pmatrix} X' & \mathbf{a} \\ \mathbf{a} & 0 \end{pmatrix}, \qquad X' \in \mathfrak{o}(n-1),\ \mathbf{a} \in \mathbb{R}^{n-1}.
\]

One can think of X′ as generating rotations in the n − 1 spatial dimensions, and a
as generating a boost along the direction it points in $\mathbb{R}^{n-1}$. We will discuss this in
detail in the case n = 4 in the next section. □
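The block form derived above can be tested for n = 4. The following is a minimal sketch in plain Python: it assembles X from an antisymmetric 3 × 3 block and a vector a, checks the defining condition (4.58), and confirms that the exponential preserves the Minkowski metric. The numerical entries and helper names are arbitrary illustrative choices.

```python
def matmul(A, B):
    """Product of two matrices stored as lists of rows."""
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*B)] for row in A]

def transpose(A):
    return [list(col) for col in zip(*A)]

def max_diff(A, B):
    """Largest entrywise difference between two matrices."""
    return max(abs(a - b) for ra, rb in zip(A, B) for a, b in zip(ra, rb))

def expm(X, terms=40):
    """Matrix exponential from the truncated power series sum_k X^k / k!."""
    n = len(X)
    total = [[1.0 if i == j else 0.0 for j in range(n)] for i in range(n)]
    term = [row[:] for row in total]
    for k in range(1, terms):
        term = [[x / k for x in row] for row in matmul(term, X)]
        total = [[s + t for s, t in zip(srow, trow)] for srow, trow in zip(total, term)]
    return total

# Minkowski metric [eta] = Diag(1, 1, 1, -1)
ETA = [[1.0, 0.0, 0.0, 0.0], [0.0, 1.0, 0.0, 0.0],
       [0.0, 0.0, 1.0, 0.0], [0.0, 0.0, 0.0, -1.0]]

spatial = [[0.0, -0.3, 0.2], [0.3, 0.0, -0.1], [-0.2, 0.1, 0.0]]  # X' in o(3)
a = [0.4, -0.5, 0.6]                                              # boost part

# Assemble X = [[X', a], [a, 0]]
X = [spatial[i] + [a[i]] for i in range(3)] + [a + [0.0]]

# Defining condition (4.58): X^T eta = -eta X
lhs = matmul(transpose(X), ETA)
rhs = [[-x for x in row] for row in matmul(ETA, X)]
algebra_defect = max_diff(lhs, rhs)

# The exponential should be a Lorentz transformation: Lam^T eta Lam = eta
Lam = expm(X)
group_defect = max_diff(matmul(matmul(transpose(Lam), ETA), Lam), ETA)
print(algebra_defect, group_defect)  # both at rounding-error level
```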


We have now described the Lie algebras of the isometry groups O(n), U(n), 
and O(n — 1, 1). What about their cousins SO(n) and SU(n)? Since these groups 
are defined by the additional condition that they have unit determinant, examining 
their Lie algebras will require that we know how to evaluate the determinant of 
an exponential. This can be accomplished via the following beautiful and useful 
formula: 


Proposition 4.1. For any finite-dimensional matrix X,

\[
\det e^X = e^{\operatorname{Tr} X}. \tag{4.59}
\]

A general proof of this is postponed to Example 4.44 and Problem 4-14, but we
can gain some insight into the formula by proving it in the case where X is
diagonalizable. Recall that diagonalizable means that there exists $A \in GL(n, \mathbb{C})$
such that

\[
A X A^{-1} = \operatorname{Diag}(\lambda_1, \lambda_2, \ldots, \lambda_n)
\]

where $\operatorname{Diag}(\lambda_1, \lambda_2, \ldots, \lambda_n)$ is a diagonal matrix with $\lambda_i$ in the ith row and column,
i.e.

\[
\operatorname{Diag}(\lambda_1, \lambda_2, \ldots, \lambda_n) = \begin{pmatrix} \lambda_1 & & & \\ & \lambda_2 & & \\ & & \ddots & \\ & & & \lambda_n \end{pmatrix}. \tag{4.60}
\]


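In this case, we have

\[
\begin{aligned}
\det e^X &= \det\left(A e^X A^{-1}\right) \\
&= \det e^{A X A^{-1}} \qquad \text{(check!)} \\
&= \det e^{\operatorname{Diag}(\lambda_1, \lambda_2, \ldots, \lambda_n)} \\
&= \det \operatorname{Diag}(e^{\lambda_1}, e^{\lambda_2}, \ldots, e^{\lambda_n}) \qquad \text{(see exercise below)} \\
&= e^{\lambda_1} e^{\lambda_2} \cdots e^{\lambda_n} \\
&= e^{\lambda_1 + \lambda_2 + \cdots + \lambda_n} \\
&= e^{\operatorname{Tr} X}
\end{aligned} \tag{4.61}
\]

and so the formula holds. □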


Exercise 4.24. Prove (4.61). That is, use the definition of the matrix exponential as a power
series to show that

\[
e^{\operatorname{Diag}(\lambda_1, \lambda_2, \ldots, \lambda_n)} = \operatorname{Diag}(e^{\lambda_1}, e^{\lambda_2}, \ldots, e^{\lambda_n}).
\]

In addition to being a useful formula, Proposition 4.1 also provides a nice
geometric interpretation of the Tr functional. Consider an arbitrary one-parameter
subgroup $\{e^{tX}\} \subset GL(n, \mathbb{C})$. We have

\[
\det e^{tX} = e^{\operatorname{Tr}(tX)} = e^{t \operatorname{Tr} X},
\]

so taking the derivative with respect to t and evaluating at t = 0 gives

\[
\left.\frac{d}{dt} \det e^{tX}\right|_{t=0} = \operatorname{Tr} X.
\]

Since the determinant measures how an operator or matrix changes volumes (cf.
Example 3.28), this tells us that the trace of the generator X gives the rate at which
volumes change under the action of the corresponding one-parameter subgroup $e^{tX}$.
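Both the formula (4.59) and the interpretation of the trace as a rate of volume change can be checked numerically on a matrix that is not diagonal to begin with. The following is a minimal sketch in plain Python; the sample matrix, the step size `h`, and the helper names are assumptions made for illustration.

```python
import math

def matmul(A, B):
    """Product of two matrices stored as lists of rows."""
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*B)] for row in A]

def expm(X, terms=40):
    """Matrix exponential from the truncated power series sum_k X^k / k!."""
    n = len(X)
    total = [[1.0 if i == j else 0.0 for j in range(n)] for i in range(n)]
    term = [row[:] for row in total]
    for k in range(1, terms):
        term = [[x / k for x in row] for row in matmul(term, X)]
        total = [[s + t for s, t in zip(srow, trow)] for srow, trow in zip(total, term)]
    return total

def det(A):
    """Determinant by cofactor expansion along the first row (fine for small n)."""
    if len(A) == 1:
        return A[0][0]
    return sum((-1) ** j * A[0][j] * det([row[:j] + row[j + 1:] for row in A[1:]])
               for j in range(len(A)))

def trace(A):
    return sum(A[i][i] for i in range(len(A)))

def scaled(t, X):
    return [[t * x for x in row] for row in X]

X = [[0.2, -1.0, 0.5], [0.3, 0.1, 0.7], [-0.4, 0.6, -0.5]]

lhs = det(expm(X))
rhs = math.exp(trace(X))
print(lhs, rhs)  # the two sides of (4.59) agree to rounding error

# Central finite difference for d/dt det(exp(tX)) at t = 0
h = 1e-5
rate = (det(expm(scaled(h, X))) - det(expm(scaled(-h, X)))) / (2 * h)
print(rate, trace(X))  # the rate of volume change equals Tr X
```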


Example 4.29. so(n) and su(n), the Lie algebras of SO(n) and SU(n) 


Recall that SO(n) and SU(n) are the subgroups of O(n) and U(n) consisting of 
matrices with unit determinant. What additional condition must we impose on the 
generators X to ensure that $\det e^{tX} = 1$ for all t? From (4.59), it’s clear that $\det e^{tX} = 1$
for all t if and only if

\[
\operatorname{Tr} X = 0.
\]


In the case of o(n), this is actually already satisfied, since the anti-symmetry 
condition implies that all the diagonal entries are zero. Thus, so(n) = o(n), and 
both Lie algebras will henceforth be denoted as so(n). We could have guessed
that they would be equal; as discussed above, generators X are in one-to-one
correspondence with “infinitesimal” transformations $I + \epsilon X$, so the Lie algebra just
tells us about transformations that are close to the identity. However, the discussion 
from Example 4.15 generalizes easily to show that O(n) is a group with two 
components, and that the component which contains the identity is just SO(n). 
Thus the set of “infinitesimal” transformations of both groups should be equal, and 
so should their Lie algebras. The same argument applies to SO(3, 1) and O(3, 1);
their Lie algebras are thus identical, and will both be denoted as so(3, 1).

For su(n) the story is a little different. Here, the anti-hermiticity condition
only guarantees that the trace of an anti-Hermitian matrix is pure imaginary, so
demanding that it actually be zero is a bona fide constraint. Thus, su(n) can without
redundancy be described as the set of traceless, anti-Hermitian n × n matrices. The
tracelessness condition provides one additional constraint beyond anti-hermiticity,
so that

\[
\dim \mathfrak{su}(n) = \dim \mathfrak{u}(n) - 1 = n^2 - 1.
\]

Can you find a nice basis for su(n)? □


Now that we are acquainted with the Lie algebras of our favorite matrix Lie 
groups (viewed as isometry groups), it is time to point out some common features 
that they share. First off, they are all real vector spaces, as you can easily check.²⁸
Secondly, they are closed under commutators, in the sense that if X and Y are 
elements of the Lie algebra, then so is 


[X,Y] = XY - YX. 


You will check this in Exercise 4.25 below. Thirdly, these Lie algebras (and, in fact, 
all sets of matrices) satisfy the Jacobi Identity, 


\[
[[X, Y], Z] + [[Y, Z], X] + [[Z, X], Y] = 0 \qquad \forall\, X, Y, Z \in \mathfrak{g}. \tag{4.62}
\]

This can be verified directly by expanding the commutators, and doesn’t depend on 


any special properties of g, only on the definition of the commutator. We will have 
more to say about the Jacobi identity later. 
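The closure and Jacobi properties are easy to confirm with exact integer arithmetic. The following is a minimal sketch in plain Python; the three sample matrices and the helper names are arbitrary illustrative choices, with X and Y antisymmetric (elements of so(3)) and Z deliberately generic.

```python
def matmul(A, B):
    """Product of two matrices stored as lists of rows."""
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*B)] for row in A]

def comm(X, Y):
    """Commutator [X, Y] = XY - YX."""
    XY, YX = matmul(X, Y), matmul(Y, X)
    return [[p - q for p, q in zip(rp, rq)] for rp, rq in zip(XY, YX)]

def add(*mats):
    """Entrywise sum of any number of matrices."""
    return [[sum(vals) for vals in zip(*rows)] for rows in zip(*mats)]

def transpose(A):
    return [list(col) for col in zip(*A)]

def negate(A):
    return [[-x for x in row] for row in A]

X = [[0, 1, -2], [-1, 0, 3], [2, -3, 0]]    # antisymmetric
Y = [[0, 4, 1], [-4, 0, -5], [-1, 5, 0]]    # antisymmetric
Z = [[1, 2, 0], [0, -1, 3], [4, 0, 2]]      # generic, not antisymmetric

# Jacobi identity (4.62): holds for any three square matrices
jacobi = add(comm(comm(X, Y), Z), comm(comm(Y, Z), X), comm(comm(Z, X), Y))
print(jacobi)  # the zero matrix

# Closure: the commutator of two antisymmetric matrices is antisymmetric ...
C = comm(X, Y)
print(transpose(C) == negate(C))  # True

# ... whereas the ordinary matrix product generally is not.
P = matmul(X, Y)
print(transpose(P) == negate(P))  # False
```

The last check illustrates why the commutator, rather than the matrix product, is the natural operation on a Lie algebra.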


²⁸ gl(n, C) can also be considered a complex vector space, but we won’t consider it as such in this
text.
Exercise 4.25. (a) Show directly from the defining conditions (4.56), (4.57), and (4.58) that
so(n), u(n), and so(n − 1, 1) are vector spaces, i.e. closed under scalar multiplication and
addition. Impose the additional tracelessness condition and show that su(n) is a vector space
as well.

(b) Similarly, show directly from the defining conditions that so(n), u(n), and so(n − 1, 1) are
closed under commutators.

(c) Prove the cyclic property of the Trace functional,

\[
\operatorname{Tr}(A_1 A_2 \cdots A_n) = \operatorname{Tr}(A_2 \cdots A_n A_1), \qquad A_i \in M_n(\mathbb{C})
\]

and use this to show directly that su(n) is closed under commutators.


The reason we’ve singled out these peculiar-seeming properties is that they will 
turn out to be important, and it turns out that all Lie algebras of matrix Lie groups 
enjoy them: 


Proposition 4.2. Let g be the Lie algebra of a matrix Lie group G. Then g satisfies 
the following: 


1. g is a real vector space 
2. g is closed under commutators 
3. All elements of g obey the Jacobi identity. 


Proof sketch. Proving this turns out to be somewhat technical, so we’ll just sketch 
a proof here and refer you to Hall [11] for the details. Let g be the Lie algebra of 
a matrix Lie group G, and let X, Y ∈ g. We’d first like to show that X + Y ∈ g.
Since g is closed under real scalar multiplication (why?), proving X + Y ∈ g will
be enough to show that g is a real vector space. The proof of this hinges on the
following identity, known as the Lie Product Formula, which we state but don’t
prove:

\[
e^{X+Y} = \lim_{m \to \infty} \left(e^{X/m} e^{Y/m}\right)^m.
\]

This formula should be thought of as expressing the addition operation in g in terms
of the product operation in G. With this in hand, we note that $A_m \equiv (e^{X/m} e^{Y/m})^m$
is a convergent sequence, and that every term in the sequence is in G since it is a
product of elements in G. Furthermore, the limit matrix $A = e^{X+Y}$ is in GL(n, C)
by (4.54). By the definition of a matrix Lie group, then, $A = e^{X+Y} \in G$. Applying the
same argument to tX and tY shows that $e^{t(X+Y)} \in G$ for all t, so X + Y ∈ g and thus g
is a real vector space.


The second task is to show that g is closed under commutators. First, we claim 
that for any X ∈ gl(n, C) and A ∈ GL(n, C),

\[
e^{A X A^{-1}} = A e^X A^{-1}. \tag{4.63}
\]

You can easily verify this by expanding the power series on both sides. This implies
that if X ∈ g and A ∈ G, then $A X A^{-1} \in \mathfrak{g}$ as well, since

\[
e^{t(A X A^{-1})} = A e^{tX} A^{-1} \in G \qquad \forall\, t \in \mathbb{R}.
\]
Now let $A = e^{tY}$, Y ∈ g. Then $e^{tY} X e^{-tY} \in \mathfrak{g}$ for all t, and we can compute the
derivative of this expression at t = 0:

\[
\begin{aligned}
\left.\frac{d}{dt}\, e^{tY} X e^{-tY}\right|_{t=0} &= \left. Y e^{tY} X e^{-tY}\right|_{t=0} - \left. e^{tY} X e^{-tY} Y\right|_{t=0} \\
&= YX - XY.
\end{aligned} \tag{4.64}
\]

Since we also have

\[
\left.\frac{d}{dt}\, e^{tY} X e^{-tY}\right|_{t=0} = \lim_{h \to 0} \frac{e^{hY} X e^{-hY} - X}{h}
\]

and the right side is always in g since g is a vector space,²⁹ this shows that
YX − XY = [Y, X] is in g.

The third and final task would be to verify the Jacobi identity, but as we pointed 
out above this holds for any set of matrices, and can be easily verified by direct 
computation. This completes the proof sketch. □


Before moving on to the next section, we should discuss the significance of the 
commutator. We proved above that all Lie algebras are closed under commutators, 
but so what? Why is this property worth singling out? Well, it turns out that 
the algebraic structure of the commutator on g is closely related to the algebraic 
structure of the product on G. This is most clearly manifested in the Baker- 
Campbell-Hausdorff (BCH) formula, which for X and Y sufficiently small³⁰
expresses $e^X e^Y$ as a single exponential:

\[
e^X e^Y = e^{X + Y + \frac{1}{2}[X,Y] + \frac{1}{12}[X,[X,Y]] - \frac{1}{12}[Y,[X,Y]] + \cdots}. \tag{4.66}
\]

It can be shown³¹ that the series in the exponent converges for such X and Y,
and that the series consists entirely of iterated commutators, so that the exponent
really is an element of g (if the exponent had a term like XY in it, this would
not be the case since matrix Lie algebras are in general not closed under ordinary
matrix multiplication). Thus, the BCH formula plays a role analogous to that of
the Lie product formula, but in the other direction: while the Lie product formula 


²⁹ To be rigorous, we also need to note that g is closed in the topological sense, but this can be
regarded as a technicality.

³⁰ The size of a matrix $X \in M_n(\mathbb{C})$ is usually expressed by the Hilbert-Schmidt norm, defined as

\[
\|X\| \equiv \left(\sum_{i,j=1}^{n} |X_{ij}|^2\right)^{1/2}. \tag{4.65}
\]

If we introduce the basis $\{E_{ij}\}$ of $M_n(\mathbb{C})$ from Example 2.9, then we can identify $M_n(\mathbb{C})$ with
$\mathbb{C}^{n^2}$, and then (4.65) is just the norm induced by the standard Hermitian inner product.

³¹ See Hall [11] for a nice discussion and Varadarajan [20] for a complete proof.
expresses Lie algebra addition in terms of group multiplication, the BCH formula
expresses group multiplication in terms of the commutator on the Lie algebra. The 
BCH formula thus tells us that 


Much of the group structure of G is encoded in the commutator on g. 


You will calculate the first few terms of the exponential in the BCH formula in 
Problem 4-11. 
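The first few terms of the BCH exponent can be tested numerically by truncating the series at successive orders and comparing with the product of exponentials. The following is a minimal sketch in plain Python; the choice of X and Y (small multiples of two so(3) generators), the scale `S`, and the helper names are assumptions made for illustration.

```python
def matmul(A, B):
    """Product of two matrices stored as lists of rows."""
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*B)] for row in A]

def comm(X, Y):
    """Commutator [X, Y] = XY - YX."""
    XY, YX = matmul(X, Y), matmul(Y, X)
    return [[p - q for p, q in zip(rp, rq)] for rp, rq in zip(XY, YX)]

def combo(*terms):
    """Linear combination: each term is a (coefficient, matrix) pair."""
    n = len(terms[0][1])
    return [[sum(c * M[i][j] for c, M in terms) for j in range(n)] for i in range(n)]

def expm(X, terms=40):
    """Matrix exponential from the truncated power series sum_k X^k / k!."""
    n = len(X)
    total = [[1.0 if i == j else 0.0 for j in range(n)] for i in range(n)]
    term = [row[:] for row in total]
    for k in range(1, terms):
        term = [[x / k for x in row] for row in matmul(term, X)]
        total = [[s + t for s, t in zip(srow, trow)] for srow, trow in zip(total, term)]
    return total

def max_diff(A, B):
    return max(abs(a - b) for ra, rb in zip(A, B) for a, b in zip(ra, rb))

S = 0.1  # overall scale; small so that the series converges quickly
X = [[0.0, 0.0, 0.0], [0.0, 0.0, -S], [0.0, S, 0.0]]   # S times a rotation generator
Y = [[0.0, 0.0, S], [0.0, 0.0, 0.0], [-S, 0.0, 0.0]]   # S times another one

XY = comm(X, Y)
Z1 = combo((1, X), (1, Y))                                   # first order
Z2 = combo((1, X), (1, Y), (0.5, XY))                        # second order
Z3 = combo((1, X), (1, Y), (0.5, XY),
           (1 / 12, comm(X, XY)), (-1 / 12, comm(Y, XY)))    # third order

target = matmul(expm(X), expm(Y))
errors = [max_diff(expm(Z), target) for Z in (Z1, Z2, Z3)]
print(errors)  # each added order reduces the error substantially
```

Flipping the sign of either third-order term makes the last error larger than the second-order one, which is a quick way to confirm the signs in the formula.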


4.7 The Lie Algebras of Classical and Quantum Physics 


We are now ready for a detailed look at the Lie algebras of the matrix Lie groups 
we discussed in Sect. 4.3. 


Example 4.30. so(2) 


As discussed in Example 4.26, so(2) consists of all antisymmetric 2 × 2 matrices.
All such matrices are of the form

\[
\begin{pmatrix} 0 & -a \\ a & 0 \end{pmatrix}
\]

and so so(2) is one-dimensional and we may take as a basis

\[
X = \begin{pmatrix} 0 & -1 \\ 1 & 0 \end{pmatrix}.
\]

You will explicitly compute that

\[
e^{\theta X} = \begin{pmatrix} \cos\theta & -\sin\theta \\ \sin\theta & \cos\theta \end{pmatrix} \tag{4.67}
\]

so that X really does generate counterclockwise rotations in the x-y plane. Recall
from Sect. 4.1 that X also generates rotations in the sense that for any position vector
r, Xr is just the direction in which r would change under rotation. This means that
X induces a vector field $X^\sharp$, given at an arbitrary point r by

\[
X^\sharp(\mathbf{r}) = X\mathbf{r}.
\]

This vector field is depicted in Fig. 4.8.

Exercise 4.26. Verify (4.67) by explicitly summing the power series for $e^{\theta X}$.
Fig. 4.8 The vector field $X^\sharp(\mathbf{r}) = X\mathbf{r}$ induced by the so(2) element X. This is one ...
Example 4.31. so(3)

As remarked in Example 4.26, the matrices $A_{ij}$ form a basis for o(n). We now
specialize to n = 3 and rename the $A_{ij}$ as

\[
L_x = \begin{pmatrix} 0 & 0 & 0 \\ 0 & 0 & -1 \\ 0 & 1 & 0 \end{pmatrix}, \quad
L_y = \begin{pmatrix} 0 & 0 & 1 \\ 0 & 0 & 0 \\ -1 & 0 & 0 \end{pmatrix}, \quad
L_z = \begin{pmatrix} 0 & -1 & 0 \\ 1 & 0 & 0 \\ 0 & 0 & 0 \end{pmatrix} \tag{4.68}
\]

which you may recognize from Sects. 4.1 and 4.5. Note that we can write the
components of all of these matrices in one equation as

\[
(L_i)_{jk} = -\epsilon_{ijk}, \tag{4.69}
\]

an expression which will prove useful later. An arbitrary element X ∈ so(3) looks
like

\[
X = \begin{pmatrix} 0 & -z & y \\ z & 0 & -x \\ -y & x & 0 \end{pmatrix} = xL_x + yL_y + zL_z. \tag{4.70}
\]

Note that dim so(3) = 3, which is also the number of parameters in SO(3).
You can check, as we did in Exercise 4.2, that the commutators of the basis
elements work out to be

\[
[L_x, L_y] = L_z, \qquad [L_y, L_z] = L_x, \qquad [L_z, L_x] = L_y
\]

or, if we label the generators with numbers instead of letters,

\[
[L_i, L_j] = \sum_{k=1}^{3} \epsilon_{ijk} L_k. \tag{4.71}
\]

(We will frequently change notation in this way when it proves convenient;
hopefully, this causes no confusion.) These are, of course, the well-known angular
momentum commutation relations of quantum mechanics. The relation between the
Lie algebra so(3) and the usual angular momentum operators (2.17) will be
explained in the next chapter. Note that if we defined Hermitian so(3) generators
by $L'_i \equiv i L_i$, the commutation relations would become

\[
[L'_i, L'_j] = \sum_{k=1}^{3} i\,\epsilon_{ijk} L'_k,
\]

which is the form found in most quantum mechanics texts.
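The commutation relations can be verified mechanically by building the generators from (4.69). The following is a minimal sketch in plain Python with exact integer arithmetic and 0-based indices; the helper names (`eps`, `structure_sum`) and the list `H` of Hermitian generators are illustrative assumptions.

```python
def eps(i, j, k):
    """Levi-Civita symbol for indices in {0, 1, 2}."""
    return (i - j) * (j - k) * (k - i) // 2

def matmul(A, B):
    """Product of two matrices stored as lists of rows."""
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*B)] for row in A]

def comm(X, Y):
    """Commutator [X, Y] = XY - YX."""
    XY, YX = matmul(X, Y), matmul(Y, X)
    return [[p - q for p, q in zip(rp, rq)] for rp, rq in zip(XY, YX)]

# Generators from (4.69): (L_i)_jk = -eps_ijk
L = [[[-eps(i, j, k) for k in range(3)] for j in range(3)] for i in range(3)]

def structure_sum(i, j, gens, factor=1):
    """factor * sum_k eps_ijk gens[k], the right-hand side of the relations."""
    return [[factor * sum(eps(i, j, k) * gens[k][r][c] for k in range(3))
             for c in range(3)] for r in range(3)]

pairs = [(i, j) for i in range(3) for j in range(3)]

# Anti-Hermitian generators: [L_i, L_j] = sum_k eps_ijk L_k
ok_real = all(comm(L[i], L[j]) == structure_sum(i, j, L) for i, j in pairs)

# Hermitian generators i L_i: the same relations pick up a factor of i
H = [[[1j * x for x in row] for row in Li] for Li in L]
ok_herm = all(comm(H[i], H[j]) == structure_sum(i, j, H, factor=1j) for i, j in pairs)

print(L[2])              # [[0, -1, 0], [1, 0, 0], [0, 0, 0]], i.e. L_z
print(ok_real, ok_herm)  # True True
```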
We can think of the generator X in (4.70) as generating a rotation about the axis
[X] = (x, y, z), as follows: for any $\mathbf{v} = (v_x, v_y, v_z) \in \mathbb{R}^3$, you can check that

\[
X\mathbf{v} = \begin{pmatrix} 0 & -z & y \\ z & 0 & -x \\ -y & x & 0 \end{pmatrix} \begin{pmatrix} v_x \\ v_y \\ v_z \end{pmatrix} = (x, y, z) \times (v_x, v_y, v_z) = [X] \times \mathbf{v}
\]

so that if v lies along the axis [X], then Xv = 0. This then implies that

\[
e^{tX}[X] = [X]
\]

(why?), so $e^{tX}$ must be a rotation about [X] (since it leaves [X] fixed). In fact, if we
take [X] to be a unit vector and rename it $\hat{\mathbf{n}}$, and also rename t as θ, you will show
below that

\[
e^{\theta X} = \begin{pmatrix}
n_x^2(1-\cos\theta) + \cos\theta & n_x n_y(1-\cos\theta) - n_z \sin\theta & n_x n_z(1-\cos\theta) + n_y \sin\theta \\
n_y n_x(1-\cos\theta) + n_z \sin\theta & n_y^2(1-\cos\theta) + \cos\theta & n_y n_z(1-\cos\theta) - n_x \sin\theta \\
n_z n_x(1-\cos\theta) - n_y \sin\theta & n_z n_y(1-\cos\theta) + n_x \sin\theta & n_z^2(1-\cos\theta) + \cos\theta
\end{pmatrix} \tag{4.72}
\]

proving our claim from Example 4.14 that this matrix represents a rotation about $\hat{\mathbf{n}}$
by an angle θ.
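The closed form (4.72) can be compared against a direct summation of the exponential series. The following is a minimal sketch in plain Python; the unit vector, the angle, and the helper names are arbitrary illustrative choices, and the closed form is entered in the equivalent compact shape cos θ δ_ij + (1 − cos θ) n_i n_j + sin θ X_ij.

```python
import math

def matmul(A, B):
    """Product of two matrices stored as lists of rows."""
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*B)] for row in A]

def expm(X, terms=40):
    """Matrix exponential from the truncated power series sum_k X^k / k!."""
    n = len(X)
    total = [[1.0 if i == j else 0.0 for j in range(n)] for i in range(n)]
    term = [row[:] for row in total]
    for k in range(1, terms):
        term = [[x / k for x in row] for row in matmul(term, X)]
        total = [[s + t for s, t in zip(srow, trow)] for srow, trow in zip(total, term)]
    return total

def max_diff(A, B):
    return max(abs(a - b) for ra, rb in zip(A, B) for a, b in zip(ra, rb))

n = (1 / 3, 2 / 3, 2 / 3)   # a unit vector: (1 + 4 + 4) / 9 = 1
theta = 0.7

# The generator with [X] = n, as in (4.70)
X = [[0.0, -n[2], n[1]],
     [n[2], 0.0, -n[0]],
     [-n[1], n[0], 0.0]]

R = expm([[theta * x for x in row] for row in X])

# Entries of (4.72), written compactly
c, s = math.cos(theta), math.sin(theta)
closed = [[(1 - c) * n[i] * n[j] + (c if i == j else 0.0) + s * X[i][j]
           for j in range(3)] for i in range(3)]
series_error = max_diff(R, closed)

# The axis is left fixed: R n = n
axis_image = [sum(R[i][j] * n[j] for j in range(3)) for i in range(3)]
axis_error = max(abs(p - q) for p, q in zip(axis_image, n))
print(series_error, axis_error)  # both at rounding-error level
```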

The astute reader may recall that we already made a connection between
antisymmetric 3 × 3 matrices and rotations back in Example 3.30. There we saw
that if A(t) was a time-dependent orthogonal matrix representing the rotation of a
rigid body, then the associated angular velocity bivector (in the space frame) was
$[\tilde{\omega}] = \frac{dA}{dt} A^{-1}$. If we let A(t) be a one-parameter subgroup $A(t) = e^{tX}$ generated
by some X ∈ so(3), then the associated angular velocity bivector is

\[
\begin{aligned}
[\tilde{\omega}] &= \frac{dA(t)}{dt} A^{-1} \\
&= X e^{tX} e^{-tX} \\
&= X.
\end{aligned} \tag{4.73}
\]

Thus:

The angular velocity bivector is just a rotation generator.

(Recall that we gave a more heuristic demonstration of this fact back in Example 4.3
at the beginning of this chapter, which may be worth re-reading now that you’ve seen
a more precise treatment.) Furthermore, applying the J map from Example 3.10 to
both sides of (4.73) gives

\[
[\omega] = [X]
\]

and so the pseudovector ω is just the rotation generator expressed in coordinates [X].
Note that this agrees with our discussion above, where we found that [X] gave the
axis of rotation.


Exercise 4.27. Let X ∈ so(3) be given by

\[
X = \begin{pmatrix} 0 & -n_z & n_y \\ n_z & 0 & -n_x \\ -n_y & n_x & 0 \end{pmatrix}
\]

where $\hat{\mathbf{n}} = (n_x, n_y, n_z)$ is a unit vector. Verify (4.72) by explicitly summing the power
series for $e^{\theta X}$.

Exercise 4.28. Using the basis $B = \{L_x, L_y, L_z\}$ for so(3), show that

\[
[[X, Y]]_B = [X]_B \times [Y]_B,
\]

which shows that in components, the commutator is given by the usual cross product on $\mathbb{R}^3$.

Note: On the left-hand side of the above equation the two sets of brackets have entirely 
different meanings. The inner bracket is the Lie bracket, while the outer bracket denotes the 
component representation of an element of the Lie algebra. 
Example 4.32. su(2)

We already met su(2), the set of 2 × 2 traceless anti-Hermitian matrices, in
Example 4.22. We took as a basis

\[
S_1 = \frac{1}{2}\begin{pmatrix} 0 & -i \\ -i & 0 \end{pmatrix}, \quad
S_2 = \frac{1}{2}\begin{pmatrix} 0 & -1 \\ 1 & 0 \end{pmatrix}, \quad
S_3 = \frac{1}{2}\begin{pmatrix} -i & 0 \\ 0 & i \end{pmatrix}.
\]

Notice again that the number of parameters in the matrix Lie group SU(2) is equal to
the dimension of its Lie algebra su(2). You can check that the commutation relation
for these basis vectors is

\[
[S_i, S_j] = \sum_{k=1}^{3} \epsilon_{ijk} S_k
\]

which is the same as the so(3) commutation relations! Does that mean that su(2)
and so(3) are, in some sense, the same? And is this in some way related to the
homomorphism from SU(2) to SO(3) that we discussed in Example 4.22? The
answer to both these questions is yes, as we will discuss in the next few sections.
We will also see that, just as $X = xL_x + yL_y + zL_z \in \mathfrak{so}(3)$ can be interpreted as
generating a rotation about $(x, y, z) \in \mathbb{R}^3$, so can

\[
Y = xS_1 + yS_2 + zS_3 = \frac{1}{2}\begin{pmatrix} -iz & -ix - y \\ -ix + y & iz \end{pmatrix} \in \mathfrak{su}(2).
\]

In fact, you can show that

\[
e^{\theta n^i S_i} = \begin{pmatrix} \cos(\theta/2) - i n_z \sin(\theta/2) & (-i n_x - n_y)\sin(\theta/2) \\ (-i n_x + n_y)\sin(\theta/2) & \cos(\theta/2) + i n_z \sin(\theta/2) \end{pmatrix}, \tag{4.74}
\]

which we claimed earlier implements a rotation by angle θ around axis $\hat{\mathbf{n}} =
(n_x, n_y, n_z)$. To prove this, see Exercise 4.29 and Problem 4-7.

Exercise 4.29. Let $\hat{\mathbf{n}} = (n^1, n^2, n^3)$ be a unit vector. Prove (4.74) by direct calculation,
and use that to show that

\[
e^{\theta n^i S_i} = \cos(\theta/2)\, I + 2\sin(\theta/2)\, n^i S_i.
\]

You will use this formula in Problem 4-7.
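The su(2) commutation relations and the closed form for the exponential can both be checked numerically. The following is a minimal sketch in plain Python using built-in complex numbers; the basis is entered as −(i/2) times the Pauli matrices, and the unit vector, angle, and helper names are assumptions made for illustration.

```python
import math

def eps(i, j, k):
    """Levi-Civita symbol for indices in {0, 1, 2}."""
    return (i - j) * (j - k) * (k - i) // 2

def matmul(A, B):
    """Product of two matrices stored as lists of rows."""
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*B)] for row in A]

def comm(X, Y):
    XY, YX = matmul(X, Y), matmul(Y, X)
    return [[p - q for p, q in zip(rp, rq)] for rp, rq in zip(XY, YX)]

def expm(X, terms=40):
    """Matrix exponential from the truncated power series sum_k X^k / k!."""
    n = len(X)
    total = [[1.0 if i == j else 0.0 for j in range(n)] for i in range(n)]
    term = [row[:] for row in total]
    for k in range(1, terms):
        term = [[x / k for x in row] for row in matmul(term, X)]
        total = [[s + t for s, t in zip(srow, trow)] for srow, trow in zip(total, term)]
    return total

def max_diff(A, B):
    return max(abs(a - b) for ra, rb in zip(A, B) for a, b in zip(ra, rb))

# Basis of su(2): traceless anti-Hermitian 2 x 2 matrices
S = [
    [[0, -0.5j], [-0.5j, 0]],
    [[0, -0.5], [0.5, 0]],
    [[-0.5j, 0], [0, 0.5j]],
]

# Commutation relations [S_i, S_j] = sum_k eps_ijk S_k
comm_error = max(
    max_diff(comm(S[i], S[j]),
             [[sum(eps(i, j, k) * S[k][r][c] for k in range(3)) for c in range(2)]
              for r in range(2)])
    for i in range(3) for j in range(3)
)

n = (1 / 3, 2 / 3, 2 / 3)   # unit vector
theta = 0.7
nS = [[sum(n[k] * S[k][r][c] for k in range(3)) for c in range(2)] for r in range(2)]

# exp(theta n.S) against cos(theta/2) I + 2 sin(theta/2) n.S
U = expm([[theta * x for x in row] for row in nS])
closed = [[math.cos(theta / 2) * (1 if r == c else 0) + 2 * math.sin(theta / 2) * nS[r][c]
           for c in range(2)] for r in range(2)]
exp_error = max_diff(U, closed)

# A full turn theta = 2 pi gives -I rather than I
full_turn = expm([[2 * math.pi * x for x in row] for row in nS])
turn_error = max_diff(full_turn, [[-1, 0], [0, -1]])
print(comm_error, exp_error, turn_error)  # all at rounding-error level
```

The half-angles in the closed form are what make a rotation by 2π come out as −I, in contrast with the so(3) exponential, which returns to the identity.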
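
Example 4.33. so(3, 1) 


From Example 4.28, we know that an arbitrary X ∈ so(3, 1) can be written as 

\[
X = \begin{pmatrix} X' & a \\ a^T & 0 \end{pmatrix}
\]

with X′ ∈ so(3) and a ∈ R^3. Embedding the L_i of so(3) into so(3, 1) as 

\[
\tilde{L}_i \equiv \begin{pmatrix} L_i & 0 \\ 0 & 0 \end{pmatrix}
\]

and defining new generators 

\[
K_1 \equiv \begin{pmatrix} 0&0&0&1 \\ 0&0&0&0 \\ 0&0&0&0 \\ 1&0&0&0 \end{pmatrix}, \qquad
K_2 \equiv \begin{pmatrix} 0&0&0&0 \\ 0&0&0&1 \\ 0&0&0&0 \\ 0&1&0&0 \end{pmatrix}, \qquad
K_3 \equiv \begin{pmatrix} 0&0&0&0 \\ 0&0&0&0 \\ 0&0&0&1 \\ 0&0&1&0 \end{pmatrix},
\]

we have the following commutation relations (Exercise 4.30): 

\[
[\tilde{L}_i, \tilde{L}_j] = \sum_{k=1}^{3} \epsilon_{ijk} \tilde{L}_k \tag{4.75a}
\]
\[
[\tilde{L}_i, K_j] = \sum_{k=1}^{3} \epsilon_{ijk} K_k \tag{4.75b}
\]
\[
[K_i, K_j] = -\sum_{k=1}^{3} \epsilon_{ijk} \tilde{L}_k. \tag{4.75c}
\]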


As we mentioned in Example 4.28, the K_i can be interpreted as generating boosts 
along their corresponding axes. In fact, you will show in Exercise 4.30 that for 
u ∈ R, 

\[
e^{uK_3} = \begin{pmatrix}
1 & 0 & 0 & 0 \\
0 & 1 & 0 & 0 \\
0 & 0 & \cosh u & -\sinh u \\
0 & 0 & -\sinh u & \cosh u
\end{pmatrix} \tag{4.76}
\]

which we know from Example 4.17 represents a boost of speed β = v/c = tanh u 
in the z-direction. We will comment on the appearance of the rapidity u here in 
Box 4.5 below. 

The commutation relations in (4.75) bear interpretation. Equation (4.75a) is 
of course just the usual so(3) commutation relations. Equation (4.75b) says that 
the K_i transform like a vector under rotations [note the similarity with (3.57)]. 
Finally, (4.75c) says that if we perform two successive boosts in different directions, 
the order matters. Furthermore, the difference between performing them in one 
order and the other is actually a rotation! This then implies, by the 
Baker-Campbell-Hausdorff formula (4.66), that the product of two finite boosts is not 
necessarily another boost, as mentioned in Example 4.17. 


Exercise 4.30. Verify the commutation relations (4.75), and verify (4.76) by explicitly 
summing the exponential power series. For a further challenge, sum the exponential power 
series for e^{u^i K_i} to get (4.33), which represents a boost in the direction u. 
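
The commutation relations (4.75) can also be checked mechanically. The sketch below does so in exact integer arithmetic. It assumes the so(3) basis is given by (L_i)_{jk} = −ε_{ijk}, which we take to be the basis used earlier in the text; the helper names are ours.

```python
def matmul(a, b):
    """Product of two square matrices stored as nested lists."""
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)] for i in range(n)]

def lincomb(coeffs, mats):
    """Linear combination sum_k coeffs[k] * mats[k]."""
    n = len(mats[0])
    return [[sum(c * m[i][j] for c, m in zip(coeffs, mats)) for j in range(n)] for i in range(n)]

def comm(a, b):
    """Commutator [a, b] = ab - ba."""
    return lincomb([1, -1], [matmul(a, b), matmul(b, a)])

def eps(i, j, k):
    """Levi-Civita symbol for indices in {0, 1, 2}."""
    return (i - j) * (j - k) * (k - i) // 2

# Assumed convention: (L_i)_{jk} = -eps_{ijk}, padded with a zero row and column
# to embed so(3) into so(3, 1).
L = [[[-eps(i, j, k) if j < 3 and k < 3 else 0 for k in range(4)]
      for j in range(4)] for i in range(3)]

# Boost generators: K_i has a 1 in positions (i, 4) and (4, i) (1-based), zeros elsewhere.
K = [[[1 if {j, k} == {i, 3} else 0 for k in range(4)]
      for j in range(4)] for i in range(3)]

def structure(i, j, basis, sign=1):
    """sign * sum_k eps_ijk basis[k]."""
    return lincomb([sign * eps(i, j, k) for k in range(3)], basis)

pairs = [(i, j) for i in range(3) for j in range(3)]
rel_a = all(comm(L[i], L[j]) == structure(i, j, L) for i, j in pairs)           # (4.75a)
rel_b = all(comm(L[i], K[j]) == structure(i, j, K) for i, j in pairs)           # (4.75b)
rel_c = all(comm(K[i], K[j]) == structure(i, j, L, sign=-1) for i, j in pairs)  # (4.75c)
print(rel_a, rel_b, rel_c)
```

Because every entry is an integer, the comparisons are exact rather than approximate. Note that (4.75c) is quadratic in the K_i, so it is insensitive to the overall sign convention chosen for the boost generators.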


Box 4.5 The Addition of Rapidities and the Einstein Velocity Addition Law 

The fact that the rapidity u appears in the exponent in (4.76) shows that u is 
a kind of "boost angle" which provides a natural measure for the magnitude 
of a boost. In fact, just as the angles of rotation about a single axis add, so do 
boosts in a given direction: letting K = K_3 and invoking the addition property 
of exponents³² we have 

\[
e^{u_1 K} e^{u_2 K} = e^{(u_1 + u_2)K} \tag{4.77}
\]

which says that a boost by rapidity u_2 followed by a boost by u_1 (all in the 
z-direction) is equivalent to a boost by u_1 + u_2. Of course, the same is not 
true for boost velocities. For these, we invoke the addition law for hyperbolic 
tangents, 

\[
\tanh(u_1 + u_2) = \frac{\tanh u_1 + \tanh u_2}{1 + \tanh u_1 \tanh u_2}, \tag{4.78}
\]

which combined with the substitutions β_{1+2} = tanh(u_1 + u_2) and β_i = tanh u_i 
becomes 

\[
\beta_{1+2} = \frac{\beta_1 + \beta_2}{1 + \beta_1 \beta_2}.
\]

This, of course, is just Einstein's relativistic velocity addition law; we see here 
that it follows from the additivity of rapidities, and that it is essentially just the 
addition law for hyperbolic tangents! 


Exercise 4.31. 
Derive (4.78) by substituting (4.76) into (4.77). 
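
To see the content of Box 4.5 with numbers, one can compare the velocity addition law with the "add the rapidities, then take tanh" recipe. The function names and the sample velocities below are illustrative assumptions, not anything fixed by the text.

```python
import math

def add_velocities(beta1, beta2):
    """Einstein velocity addition for collinear boosts (velocities in units of c)."""
    return (beta1 + beta2) / (1 + beta1 * beta2)

def rapidity(beta):
    """Rapidity u such that beta = tanh(u)."""
    return math.atanh(beta)

beta1, beta2 = 0.6, 0.7
combined = add_velocities(beta1, beta2)
via_rapidity = math.tanh(rapidity(beta1) + rapidity(beta2))
print(combined, via_rapidity)
```

The two numbers should agree to rounding error, and the combined speed stays below 1 even though 0.6 + 0.7 exceeds it: rapidities add linearly, velocities do not.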


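Example 4.34. sl(2, C)_R 


sl(2, C)_R is defined to be the Lie algebra of SL(2, C), viewed as a real vector space. 
Since SL(2, C) is just the set of all 2 × 2 complex matrices with unit determinant, 
sl(2, C)_R is just the set of all traceless 2 × 2 complex matrices, and thus could be 
viewed as a complex vector space, though we won't take that point of view here.³³ 
A (real) basis for sl(2, C)_R is 

\[
S_1 = \frac{1}{2}\begin{pmatrix} 0 & -i \\ -i & 0 \end{pmatrix}, \qquad
S_2 = \frac{1}{2}\begin{pmatrix} 0 & -1 \\ 1 & 0 \end{pmatrix}, \qquad
S_3 = \frac{1}{2}\begin{pmatrix} -i & 0 \\ 0 & i \end{pmatrix},
\]
\[
\tilde{K}_1 = \frac{1}{2}\begin{pmatrix} 0 & 1 \\ 1 & 0 \end{pmatrix}, \qquad
\tilde{K}_2 = \frac{1}{2}\begin{pmatrix} 0 & -i \\ i & 0 \end{pmatrix}, \qquad
\tilde{K}_3 = \frac{1}{2}\begin{pmatrix} 1 & 0 \\ 0 & -1 \end{pmatrix}.
\]


32 Which, of course, only holds for matrices when the matrices being added commute. 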


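Note that K̃_i = iS_i. This fact simplifies certain computations, such as the ones you 
will perform in checking that these generators satisfy the following commutation 
relations: 

\[
\begin{aligned}
[S_i, S_j] &= \sum_{k=1}^{3} \epsilon_{ijk} S_k \\
[S_i, \tilde{K}_j] &= \sum_{k=1}^{3} \epsilon_{ijk} \tilde{K}_k \\
[\tilde{K}_i, \tilde{K}_j] &= -\sum_{k=1}^{3} \epsilon_{ijk} S_k.
\end{aligned}
\]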


These are identical to the so(3, 1) commutation relations! As in the case of su(2) and 
so(3), this is intimately related to the homomorphism from SL(2, C) to SO(3, 1)_o 
that we described in Example 4.23. This will be discussed in Sect. 4.9. 


Exercise 4.32. Check that the K̃_i generate boosts, as you would expect, by explicitly 
calculating e^{u^i K̃_i} to get (4.36). 


4.8 Abstract Lie Algebras 


So far we have considered Lie algebras associated with matrix Lie groups, and 
we sketched proofs that these sets are real vector spaces which are closed under 
commutators. As in the case of abstract vector spaces and groups, however, we can 
now turn around and use these properties to define abstract Lie algebras. This will 
clarify the nature of the Lie algebras we've already met, as well as permit discussion 
of other examples relevant for physics. 


33 The space of all traceless 2 × 2 complex matrices viewed as a complex vector space is denoted 
sl(2, C) without the R subscript. In this case, the S_i suffice to form a basis. In this text, however, 
we will usually take Lie algebras to be real vector spaces, even if they naturally form complex 
vector spaces as well. You should be aware, though, that sl(2, C) is fundamental in the theory of 
complex Lie algebras, and so in the math literature the Lie algebra of SL(2, C) is almost always 
considered to be the complex vector space sl(2, C) rather than the real Lie algebra sl(2, C)_R. We 
will have more to say about sl(2, C) in the appendix. 

That said, a (real, abstract) Lie algebra is defined to be a real vector space 
g equipped with a bilinear map [·,·] : g × g → g called the Lie bracket which 
satisfies 


1. [X, Y] = −[Y, X] ∀ X, Y ∈ g (Antisymmetry) 
2. [[X, Y], Z] + [[Y, Z], X] + [[Z, X], Y] = 0 ∀ X, Y, Z ∈ g (Jacobi identity) 


By construction, all Lie algebras of matrix Lie groups satisfy this definition 
(when we take the bracket to be the commutator), and we will see that it is 
precisely the above properties of the commutator that make those Lie algebras 
useful in applications. Furthermore, there are some (abstract) Lie algebras that arise 
in physics for which the bracket is not a commutator, and which are not usually 
associated with a matrix Lie group; this definition allows us to include those algebras 
in our discussion. We’ll meet a few of these algebras below, but first we consider 
two basic examples. 
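
A standard example of a bracket that is not presented as a commutator is the cross product on R^3, which is bilinear and satisfies both axioms. The short check below is a sketch of our own (the sample vectors are arbitrary), and of course testing a few vectors is an illustration rather than a proof.

```python
from itertools import product

def cross(u, v):
    """Cross product on R^3, playing the role of the Lie bracket [u, v]."""
    return (u[1] * v[2] - u[2] * v[1],
            u[2] * v[0] - u[0] * v[2],
            u[0] * v[1] - u[1] * v[0])

def add3(a, b, c):
    """Componentwise sum of three vectors."""
    return tuple(x + y + z for x, y, z in zip(a, b, c))

vectors = [(1, -2, 3), (4, 0, -5), (-7, 6, 2)]

# Axiom 1: [u, v] = -[v, u]
antisymmetric = all(
    cross(u, v) == tuple(-c for c in cross(v, u))
    for u, v in product(vectors, repeat=2)
)

# Axiom 2: [[u, v], w] + [[v, w], u] + [[w, u], v] = 0
jacobi = all(
    add3(cross(cross(u, v), w), cross(cross(v, w), u), cross(cross(w, u), v)) == (0, 0, 0)
    for u, v, w in product(vectors, repeat=3)
)
print(antisymmetric, jacobi)
```

Integer vectors are used so that both identities hold exactly, with no floating-point tolerance needed.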


Example 4.35. gl(V) The Lie algebra of linear operators on a vector space 


Let V be a (possibly infinite-dimensional) vector space. We can turn L(V), the set 
of all linear operators on V, into a Lie algebra by taking the Lie bracket to be the 
commutator, i.e. 

\[
[T, U] = TU - UT, \qquad T, U \in \mathcal{L}(V).
\]

Note that this is a commutator of operators, not matrices, though of course there 
is a nice correspondence between the two when V is finite-dimensional and we 
introduce a basis. This Lie bracket is obviously anti-symmetric and can be seen to 
obey the Jacobi identity, so it turns L(V) into a Lie algebra which we'll denote by 
gl(V).³⁴ We'll have more to say about gl(V) as we progress. 


Example 4.36. isom(V) The Lie algebra of anti-Hermitian operators 


Consider the setup of the previous example, except now let V be an inner product 
space. For any T ∈ L(V), the inner product on V allows us [via (4.13)] to define 
its adjoint T†, and we can then define isom(V) ⊂ gl(V) to be the set of all 
anti-Hermitian operators, i.e. those which satisfy 

\[
T^\dagger = -T. \tag{4.79}
\]

You can easily verify that isom(V) is a Lie subalgebra of gl(V). (A Lie subalgebra 
of a Lie algebra g is a vector subspace h ⊂ g that is also closed under the Lie 
bracket, and hence forms a Lie algebra itself.) 


34 There is a subtlety here: the vector space underlying gl(V) is of course just L(V), so the 
difference between the two is just that one comes equipped with a Lie bracket, and the other is 
considered as a vector space with no additional structure. 

This definition is very reminiscent of the characterization of u(n) that we gave 
in Example 4.27; in fact, if V is complex n-dimensional and we introduce an 
orthonormal basis, then isom(V) = u(n)! The reason we introduce isom(V) here 
as an abstract Lie algebra is that in the infinite-dimensional case we cannot view 
isom(V) as the Lie algebra associated to a matrix Lie group, because we haven’t 
developed any theory for infinite-dimensional matrices. Nonetheless, isom(V) 
should be thought of as a coordinate-free, infinite-dimensional analog of u(n), and 
we’ll find that it plays a central role in quantum mechanics, as we’ll see in the next 
example. 


Example 4.37. The Poisson bracket on phase space 


Consider a physical system with a 2n-dimensional phase space P parametrized by n 
generalized coordinates q_i and the n conjugate momenta p_i. The set of all 
complex-valued, infinitely differentiable³⁵ functions on P is a real vector space which we'll 
denote by C(P). We can turn C(P) into a Lie algebra using the Poisson bracket as 
our Lie bracket, where the Poisson bracket is defined by 

\[
\{f, g\} \equiv \sum_i \left( \frac{\partial f}{\partial q_i}\frac{\partial g}{\partial p_i}
 - \frac{\partial g}{\partial q_i}\frac{\partial f}{\partial p_i} \right), \qquad f, g \in C(P).
\]


The anti-symmetry of the Poisson bracket is clear, and the Jacobi identity can be 
verified directly by a brute-force calculation. 

The functions in C(P) are known as observables, and the Poisson bracket thus 
turns the set of observables into one huge³⁶ Lie algebra. The standard (or canonical) 
quantization prescription, as developed by Dirac, Heisenberg, and the other founders 
of quantum mechanics, is to then interpret this Lie algebra of observables as a 
Lie subalgebra of isom(H) for some Hilbert space H (this identification is known 
as a Lie algebra representation, which is the subject of the next chapter). The 
commutator of the observables in isom(H) is then just given by the Poisson bracket 
of the corresponding functions in C(P). Thus the set of all observables in quantum 
mechanics forms a Lie algebra, which is one of our main reasons for studying Lie 
algebras here. 

Though C(P) is in general infinite-dimensional, it often has interesting 
finite-dimensional Lie subalgebras. For instance, if P = R^6 and the q_i are just the usual 
cartesian coordinates for R^3, we can consider the three components of the angular 
momentum, 

\[
\begin{aligned}
J_1 &= q_2 p_3 - q_3 p_2 \\
J_2 &= q_3 p_1 - q_1 p_3 \\
J_3 &= q_1 p_2 - q_2 p_1,
\end{aligned} \tag{4.80}
\]


35 A function is "infinitely differentiable" if it can be differentiated an arbitrary number of times. 
Besides the step function and its derivative, the Dirac delta "function", most functions that one 
meets in classical physics and quantum mechanics are infinitely differentiable. This includes the 
exponential and trigonometric functions, as well as any other function that permits a power series 
expansion. 


36 By this we mean infinite-dimensional, and usually requiring a basis that cannot be indexed by 
the integers but rather must be labeled by elements of R or some other continuous set. You should 
recall from Sect. 3.7 that L^2(R) was another such "huge" vector space. 


all of which are in C(R^6). You will check below that the Poisson bracket of these 
functions turns out to be 

\[
\{J_i, J_j\} = \sum_{k=1}^{3} \epsilon_{ijk} J_k, \tag{4.81}
\]

which are of course the familiar angular momentum (so(3)) commutation relations; 
as mentioned above, this is in fact where the angular momentum commutation 
relations come from! This then implies that so(3) is a Lie subalgebra of C(R^6). You 
may be wondering, however, why the angular momentum commutation relations 
are the same as the so(3) commutation relations. What do the functions J_i have to 
do with generators of rotations? The answer has to do with a general relationship 
between symmetries and conserved quantities, which we summarize as follows 
(note: the following discussion is rather dense, and can be omitted on a first reading). 

Consider a classical system (i.e., a phase space P together with a Hamiltonian 
H ∈ C(P)) which has a matrix Lie group G of canonical transformations³⁷ acting 
on it. If H is invariant³⁸ under the action of G, then G is said to be a group of 
symmetries of the system. In this case, one can then show³⁹ that for every X ∈ g 
there is a function f_X ∈ C(P) which is constant along particle trajectories in P 
(where trajectories, of course, are given by solutions to Hamilton's equations). This 
fact is known as Noether's Theorem, and it tells us that every element X of the Lie 
algebra gives a conserved quantity f_X. Furthermore, the Poisson bracket between 
two such functions f_X and f_Y is given just by the function associated with the Lie 
bracket of the corresponding elements of g, i.e. 

\[
\{f_X, f_Y\} = f_{[X,Y]}. \tag{4.82}
\]


If G = SO(3) acting on P by rotations, then it turns out⁴⁰ that 

\[
f_X(q_i, p_i) = (J(q_i, p_i) \,|\, [X]), \qquad X \in \mathfrak{so}(3)
\]

where J = (J_1, J_2, J_3) is the usual angular momentum vector with components 
given by (4.80) and (·|·) is the usual inner product on R^3. In particular, for 
X = L_i ∈ so(3) we have 

\[
f_{L_i} = (J \,|\, [L_i]) = (J \,|\, e_i) = J_i.
\]


37 That is, one-to-one and onto transformations which preserve the form of Hamilton's equations. 
See Goldstein [8]. 


38 Let φ_g : P → P be the transformation of P corresponding to the group element g ∈ G. Then 
H is invariant under G if H(φ_g(p)) = H(p) ∀ p ∈ P, g ∈ G. 


39 Under some mild assumptions. See Cannas [4] or Arnold [1], for example. 


40 See Arnold [1] for a discussion of Noether's theorem and a derivation of an equivalent formula. 


Thus, the conserved quantity associated with rotations about the ith axis is just 
the ith component of angular momentum. This is the connection between angular 
momentum and rotations, and from (4.82) we see that the J_i must have the same 
commutation relations as the L_i, which is of course what we found in (4.81). □ 


Exercise 4.33. Verify (4.81). Also show that for a function F ∈ C(R^6) that depends only 
on the coordinates q_i, 

\[
\{p_j, F(q_i)\} = -\frac{\partial F}{\partial q_j}
\]

and 

\[
\{J_3, F(q_i)\} = -q_1 \frac{\partial F}{\partial q_2} + q_2 \frac{\partial F}{\partial q_1} = -\frac{\partial F}{\partial \phi}
\]

where φ is the azimuthal angle. We will interpret these results in the next chapter. 
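
The Poisson bracket relations (4.81) can be spot-checked numerically. The sketch below evaluates the bracket at one arbitrarily chosen phase-space point using central differences, which are exact (up to rounding) for the J_i because each is quadratic in the coordinates. All function names and the sample point are our own illustrative choices.

```python
def J(i):
    """Return the angular momentum component J_{i+1} as a function of z = (q1, q2, q3, p1, p2, p3)."""
    a, b = (i + 1) % 3, (i + 2) % 3
    return lambda z: z[a] * z[3 + b] - z[b] * z[3 + a]

def partial(f, z, idx, h=0.5):
    """Central-difference partial derivative of f with respect to z[idx]."""
    up, down = list(z), list(z)
    up[idx] += h
    down[idx] -= h
    return (f(up) - f(down)) / (2 * h)

def poisson(f, g, z):
    """Poisson bracket {f, g} evaluated at the phase-space point z."""
    return sum(
        partial(f, z, i) * partial(g, z, 3 + i) - partial(f, z, 3 + i) * partial(g, z, i)
        for i in range(3)
    )

def eps(i, j, k):
    """Levi-Civita symbol for indices in {0, 1, 2}."""
    return (i - j) * (j - k) * (k - i) // 2

z0 = [0.3, -1.2, 2.0, 0.7, 1.5, -0.4]  # arbitrary sample point (q1, q2, q3, p1, p2, p3)

relations_ok = all(
    abs(poisson(J(i), J(j), z0) - sum(eps(i, j, k) * J(k)(z0) for k in range(3))) < 1e-9
    for i in range(3) for j in range(3)
)
print(relations_ok)
```

A check at a single point is not a proof, but since both sides of (4.81) are polynomials, a failure at a generic point would reveal a sign or index error immediately.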


If we have a one-dimensional system with position coordinate q and conjugate 
momentum p, then P = R^2 and C(P) contains another well-known Lie algebra: 
the Heisenberg algebra. 


Example 4.38. The Heisenberg algebra 


Define the Heisenberg algebra H to be the span of {q, p, 1} ⊂ C(R^2), where 1 ∈ H 
is just the constant function with value 1. The only nontrivial Poisson bracket is 
between p and q, which you can check is just 

\[
\{q, p\} = 1.
\]

Apart from the factors of ħ (which we've dropped throughout the text) and i (which 
is just an artifact of the physicist's definition of a Lie algebra), this is the familiar 
commutation relation from quantum mechanics. H is clearly closed under the Lie 
(Poisson) bracket, and is thus a Lie subalgebra of C(R^2). 

Can p and q be thought of as generators of specific transformations? Well, one of 
the most basic representations of p and q as operators is on the vector space L^2(R), 
where 

\[
\begin{aligned}
\hat{q} f(x) &= x f(x) \\
\hat{p} f(x) &= -\frac{df}{dx}
\end{aligned} \tag{4.83}
\]

(again we drop the factor of i, in disagreement with the physicist's convention). 
Note that [q̂, p̂] = {q, p} = 1. If we exponentiate p̂, we find (see Exercise 4.34 
below) 

\[
e^{t\hat{p}} f(x) = f(x - t)
\]


so that p̂ = −d/dx generates translations along the x-axis! It follows that {e^{t p̂} | t ∈ R} 
is the one-parameter subgroup of all translations along the x-axis. What about q̂? 
Well, if we work in the momentum representation of Example 3.17 so that we're 
dealing with the Fourier transform φ(p) of f(x), you know from Exercise 3.19 that 
q̂ is represented by i d/dp. Multiplying by i gives i q̂ = −d/dp and thus 

\[
e^{it\hat{q}} \phi(p) = \phi(p - t),
\]

which you can think of as a "translation in momentum space". We will explain the 
extra factor of i in the next chapter. 
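
For a polynomial f the power series for e^{t p̂} terminates, so the translation property can be checked exactly (up to rounding) on a computer. The sketch below represents a polynomial by its list of coefficients. The function names and the sample polynomial f(x) = 2 − 3x + x^3 are illustrative assumptions of ours.

```python
import math

def derivative(coeffs):
    """Coefficients of d/dx of the polynomial sum_k coeffs[k] x^k."""
    return [k * coeffs[k] for k in range(1, len(coeffs))]

def evaluate(coeffs, x):
    return sum(a * x ** k for k, a in enumerate(coeffs))

def exp_t_p(coeffs, t):
    """Apply exp(t p) with p = -d/dx by summing the series sum_n (-t)^n / n! d^n/dx^n."""
    result = [0.0] * len(coeffs)
    term, n = list(coeffs), 0
    while term:  # repeated differentiation eventually yields the empty (zero) polynomial
        factor = (-t) ** n / math.factorial(n)
        for k, a in enumerate(term):
            result[k] += factor * a
        term, n = derivative(term), n + 1
    return result

f = [2.0, -3.0, 0.0, 1.0]  # f(x) = 2 - 3x + x^3
t = 0.75
shifted = exp_t_p(f, t)

translation_ok = all(
    abs(evaluate(shifted, x) - evaluate(f, x - t)) < 1e-9
    for x in (-2.0, 0.0, 1.5, 3.0)
)
print(translation_ok)
```

The series being summed is just the Taylor expansion of f(x − t) about x, which is the content of Exercise 4.34.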

If we treat H as a complex vector space, then we can consider another common 
basis for it, which is {Q, P, 1} ⊂ C(R^2) where 

\[
Q \equiv \frac{p + iq}{\sqrt{2}}, \qquad P \equiv \frac{p - iq}{\sqrt{2}\, i}.
\]

You can check that 

\[
\{Q, P\} = 1 \tag{4.84}
\]

which you may recognize from classical mechanics as the condition that Q and P 
be canonical variables (see Goldstein [8]). Q and P are well suited to the solution 
of the one-dimensional harmonic oscillator problem, and you may recognize their 
formal similarity to the raising and lowering operators a and a† employed in the 
quantum-mechanical version of the same problem. 
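
For orientation, here is one way the check of (4.84) can go, using only the bilinearity and antisymmetry of the Poisson bracket together with {q, p} = 1 (so that {p, q} = −1 and {p, p} = {q, q} = 0). It assumes the normalization Q = (p + iq)/√2 and P = (p − iq)/(√2 i).

```latex
\begin{aligned}
\{Q, P\} &= \frac{1}{2i}\,\{p + iq,\; p - iq\} \\
         &= \frac{1}{2i}\,\bigl( \{p, p\} - i\{p, q\} + i\{q, p\} + \{q, q\} \bigr) \\
         &= \frac{1}{2i}\,\bigl( 0 + i + i + 0 \bigr) \\
         &= 1 .
\end{aligned}
```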

Q and P are not easily interpreted as generators of specific transformations, and 
our discussion of them helps explain why we defined abstract Lie algebras—so that 
we could work with spaces that behave like the Lie algebras of matrix Lie groups 
(in that they are vector spaces with a Lie bracket), but aren’t necessarily Lie algebras 
of matrix Lie groups themselves. 


Exercise 4.34. Show by exponentiating p̂ = −d/dx that e^{t p̂} f(x) is just the power series 
expansion for f(x − t). 


Exercise 4.35. Verify (4.84). 


4.9 Homomorphism and Isomorphism Revisited 


In Sect. 4.4 we used the notion of a group homomorphism to make precise the 
relationship between SU(2) and SO(3), as well as SL(2, C) and SO(3, 1)_o. Now 
we will define the corresponding notion for Lie algebras to make precise the 
relationship between su(2) and so(3), as well as sl(2, C)_R and so(3, 1). We will 
also show how these relationships between Lie algebras arise as a consequence of 
the relationships between the corresponding groups. 

That said, we define a Lie algebra homomorphism from a Lie algebra g to a Lie 
algebra h to be a linear map φ : g → h that preserves the Lie bracket, in the sense 
that 

\[
[\phi(X), \phi(Y)] = \phi([X, Y]) \qquad \forall\, X, Y \in \mathfrak{g}. \tag{4.85}
\]

If φ is a vector space isomorphism (which implies that g and h have the same 
dimension), then φ is said to be a Lie algebra isomorphism. In this case, there 
is a one-to-one correspondence between g and h that preserves the bracket, so just 
as with group isomorphisms we consider g and h to be equivalent and write g ≃ h. 

Sometimes the easiest way to prove that two Lie algebras g and h are isomorphic 
is with an astute choice of bases. Let {X_i}_{i=1...n} and {Y_i}_{i=1...n} be bases for g and h 
respectively. Then the commutation relations take the form 

\[
[X_i, X_j] = \sum_{k=1}^{n} c_{ij}{}^{k} X_k
\]
\[
[Y_i, Y_j] = \sum_{k=1}^{n} d_{ij}{}^{k} Y_k
\]

where the numbers c_ij^k and d_ij^k are known as the structure constants of g and h. 
(This is a bit of a misnomer, though, since the structure constants depend on a choice 
of basis, and are not as inherent a feature of the algebra as the name implies.) If one 
can exhibit bases such that c_ij^k = d_ij^k ∀ i, j, k, then it's easy to check that the map 

\[
\begin{aligned}
\phi : \mathfrak{g} &\to \mathfrak{h} \\
v^i X_i &\mapsto v^i Y_i
\end{aligned}
\]

is a Lie algebra isomorphism. We will use this below to show that so(3) ≃ su(2) 
and so(3, 1) ≃ sl(2, C)_R. 


Box 4.6 Structure Constants as Tensor Components 


The notation c_ij^k suggests that the structure constants are the components of a 
tensor, and indeed this is the case. Define a (2,1) tensor T on a Lie algebra g by 

\[
T(X, Y, f) \equiv f([X, Y]) \qquad \forall\, X, Y \in \mathfrak{g},\ f \in \mathfrak{g}^*.
\]

Then, letting {f^i}_{i=1...n} be the basis dual to {X_i}_{i=1...n}, T has components 

\[
T_{ij}{}^{k} = T(X_i, X_j, f^k) = f^k([X_i, X_j]) = f^k(c_{ij}{}^{l} X_l) = c_{ij}{}^{k}
\]

as expected. Note that T is essentially just the Lie bracket, except we have to 
feed the vector that the bracket produces to a dual vector to get a number. 


Example 4.39. gl(V) and gl(n, C) 


Let V be an n-dimensional vector space over a set of scalars C. If we choose a basis 
for V, we can define a map from the Lie algebra gl(V) of linear operators on V to 
the matrix Lie algebra gl(n, C) by T ↦ [T]. You can easily check that this is a Lie 
algebra isomorphism, so gl(V) ≃ gl(n, C). If V is an inner product space, we can 
restrict this isomorphism to isom(V) ⊂ gl(V) to get 


isom(V) ≃ o(n) if C = ℝ 
isom(V) ≃ u(n) if C = ℂ. 


Example 4.40. The ad homomorphism 


Let g be a Lie algebra. Recall from Example 2.15 that we can use the bracket to 
turn X ∈ g into a linear operator by sticking X in one argument of the bracket and 
leaving the other open, as in [X, ·]. One can easily check that [X, ·] is a linear map 
from g to g, and we denote this linear operator by ad_X. We thus have 

\[
\mathrm{ad}_X(Y) \equiv [X, Y], \qquad X, Y \in \mathfrak{g}.
\]

Note that we have turned X into a linear operator on the very space in which it lives. 
Furthermore, in the case where g is a Lie algebra of linear operators (or matrices), 
the operator ad_X is then a linear operator on a space of linear operators! This idea 
was already introduced in the context of linear operators back in Example 2.15. Our 
reason for introducing this construction here is that it actually defines a linear map 
(check!) 

\[
\begin{aligned}
\mathrm{ad} : \mathfrak{g} &\to \mathfrak{gl}(\mathfrak{g}) \\
X &\mapsto \mathrm{ad}_X
\end{aligned}
\]

between two Lie algebras. Is this map a Lie algebra homomorphism? It is if 

\[
\mathrm{ad}_{[X,Y]} = [\mathrm{ad}_X, \mathrm{ad}_Y] \tag{4.86}
\]

[notice that the bracket on the left is taken in g and on the right in gl(g)]. You will 
verify (4.86) in Exercise 4.36 below, where you will find that it is equivalent to 
the Jacobi identity! In fact, this is one way of interpreting the Jacobi identity: it 
guarantees that ad is a Lie algebra homomorphism for any Lie algebra g. 

We will see in the next chapter that the ad homomorphism occurs frequently in 
physics, and we'll find in Chap. 6 that it's also crucial to a proper understanding of 
spherical tensors. More mathematically, the ad homomorphism is also fundamental 
in the beautiful structure theory of Lie algebras; see Hall [11]. □ 


Exercise 4.36. Verify that (4.86) is equivalent to the Jacobi identity. 
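
Before doing the exercise in general, it can help to see (4.86) hold in a concrete case. The sketch below takes g to be a Lie algebra of 2 × 2 matrices with the commutator as bracket and applies both sides of (4.86) to a test matrix Z. The matrices X, Y, Z are arbitrary integer examples of ours, chosen so that the comparison is exact.

```python
def matmul(a, b):
    """Product of two square matrices stored as nested lists."""
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)] for i in range(n)]

def sub(a, b):
    return [[x - y for x, y in zip(ra, rb)] for ra, rb in zip(a, b)]

def comm(a, b):
    """Commutator [a, b] = ab - ba."""
    return sub(matmul(a, b), matmul(b, a))

def ad(x):
    """The linear operator ad_x, returned as a function Y -> [x, Y]."""
    return lambda y: comm(x, y)

X = [[1, 2], [0, -1]]
Y = [[0, 1], [3, 2]]
Z = [[2, -1], [4, 5]]

# Left side of (4.86): ad_{[X,Y]} applied to Z.
lhs = ad(comm(X, Y))(Z)

# Right side: the commutator of the operators ad_X and ad_Y, applied to Z.
rhs = sub(ad(X)(ad(Y)(Z)), ad(Y)(ad(X)(Z)))

homomorphism_ok = lhs == rhs
print(homomorphism_ok)
```

Writing out lhs == rhs in terms of brackets gives [[X, Y], Z] = [X, [Y, Z]] − [Y, [X, Z]], which is a rearrangement of the Jacobi identity.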


Before we get to more physical examples, we need to explain how a continuous 
homomorphism from a matrix Lie group G to a matrix Lie group H leads to a 
Lie algebra homomorphism from g to h. This is accomplished by the following 
proposition: 


Proposition 4.3. Let Φ : G → H be a continuous homomorphism from a 
matrix Lie group G to a matrix Lie group H. Then this induces a Lie algebra 
homomorphism φ : g → h given by 

\[
\phi(X) = \frac{d}{dt}\Phi(e^{tX})\Big|_{t=0}. \tag{4.87}
\]


Proof heuristics: Before diving into the proof, let's get a geometric sense of what 
this proposition is saying and why it should be true. Recall from Sect. 4.5 that we 
can think of g as the tangent plane to G at the identity e; Proposition 4.3 then says 
that a Lie group homomorphism Φ : G → H induces a map from the tangent space at 
the identity in G to that of H. Furthermore, if we recall that a tangent vector to the 
identity in G (i.e., X ∈ g) is associated with the one-parameter subgroup γ(t) = e^{tX}, 
then the corresponding one-parameter subgroup in H is just (Φ ∘ γ)(t) = Φ(e^{tX}). 
Taking the derivative of this curve yields φ(X), just as stated in (4.87), and we then 
have the essential relation 

\[
\Phi(e^{tX}) = e^{t\phi(X)} \tag{4.88}
\]

which we will prove below. These heuristics are illustrated in Fig. 4.9. 


Proof of Proposition 4.3. This proof is a little long, but is a nice application of all 
that we've been discussing. The idea is first to check that (4.87) defines an element 
of h, and then check that φ really is a Lie algebra homomorphism by checking that 
it's linear and preserves the Lie bracket. 

Let Φ : G → H be a homomorphism satisfying the hypotheses of the 
proposition, and let {e^{tX}} be a one-parameter subgroup in G. It's easy to see (check!) 
that {Φ(e^{tX})} is a one-parameter subgroup of H, and hence by the discussion in 
Sect. 4.5 there must be a Z ∈ h such that Φ(e^{tX}) = e^{tZ}. We can thus define our 
map φ from g to h by φ(X) = Z. We then have 

\[
\frac{d}{dt}\Phi(e^{tX})\Big|_{t=0} = Z
\]

or equivalently 

\[
\Phi(e^{tX}) = e^{t\phi(X)}, \tag{4.89}
\]

as suggested above. In addition to thinking of φ as a map between tangent planes, 
we can also roughly think of φ as the "infinitesimal" version of Φ, with φ taking 
"infinitesimal" transformations in g to "infinitesimal" transformations in h. 

Fig. 4.9 Schematic representation of the Lie algebra homomorphism φ : g → h induced by a Lie 
group homomorphism Φ : G → H. The linear map φ should be pictured as a map between the 
tangent planes at the identity of G and H. The vector φ(X) is just given by the tangent vector to 
Φ(e^{tX}) at e. 

Is φ a Lie algebra homomorphism? There are several things to check. First, we 
must check that φ is linear, which we can prove by checking that φ(sX) = sφ(X) 
and φ(X + Y) = φ(X) + φ(Y). To check that φ(sX) = sφ(X), we can just 
differentiate using the chain rule: 

\[
\begin{aligned}
\phi(sX) &= \frac{d}{dt}\Phi(e^{tsX})\Big|_{t=0} \\
&= \frac{d(st)}{dt}\,\frac{d}{d(st)}\Phi(e^{tsX})\Big|_{st=0} \\
&= sZ \\
&= s\phi(X).
\end{aligned}
\]

Checking that φ(X + Y) = φ(X) + φ(Y) is a little more involved and requires a 
bit of calculation. The idea behind this calculation is to use the Lie product formula 
to express the addition in g in terms of the group product in G, then use the fact that 
Φ is a homomorphism to express this in terms of the product in H, and then use the 
Lie product formula in reverse to express this in terms of addition in h. We have: 

\[
\begin{aligned}
\phi(X + Y) &= \frac{d}{dt}\Phi\left(e^{t(X+Y)}\right)\Big|_{t=0} \\
&= \frac{d}{dt}\Phi\left(\lim_{m\to\infty}\left(e^{tX/m}e^{tY/m}\right)^m\right)\Big|_{t=0} && \text{by Lie product formula} \\
&= \frac{d}{dt}\lim_{m\to\infty}\Phi\left(\left(e^{tX/m}e^{tY/m}\right)^m\right)\Big|_{t=0} && \text{since $\Phi$ is continuous} \\
&= \frac{d}{dt}\lim_{m\to\infty}\left(\Phi(e^{tX/m})\,\Phi(e^{tY/m})\right)^m\Big|_{t=0} && \text{since $\Phi$ is a homomorphism} \\
&= \frac{d}{dt}\lim_{m\to\infty}\left(e^{t\phi(X)/m}e^{t\phi(Y)/m}\right)^m\Big|_{t=0} && \text{by (4.88)} \\
&= \frac{d}{dt}e^{t(\phi(X)+\phi(Y))}\Big|_{t=0} && \text{by Lie product formula} \\
&= \phi(X) + \phi(Y).
\end{aligned}
\]

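So we've established that φ is a linear map from g to h. Does it preserve the 
bracket? Yes, but proving this also requires a bit of calculation. The idea behind this 
calculation is just to express everything in terms of one-parameter subgroups and 
then use the fact that Φ is a homomorphism. You can skip this calculation on a first 
reading, if desired. We have 

\[
\begin{aligned}
[\phi(X), \phi(Y)] &= \frac{d}{dt}\, e^{t\phi(X)}\,\phi(Y)\,e^{-t\phi(X)}\Big|_{t=0} && \text{by (4.64)} \\
&= \frac{d}{dt}\,\Phi(e^{tX})\,\phi(Y)\,\Phi(e^{-tX})\Big|_{t=0} && \text{by (4.88)} \\
&= \frac{d}{dt}\,\Phi(e^{tX})\left(\frac{d}{ds}\Phi(e^{sY})\Big|_{s=0}\right)\Phi(e^{-tX})\Big|_{t=0} \\
&= \frac{d}{dt}\frac{d}{ds}\,\Phi(e^{tX})\,\Phi(e^{sY})\,\Phi(e^{-tX})\Big|_{s,t=0} && \text{by (4.88)} \\
&= \frac{d}{dt}\frac{d}{ds}\,\Phi\left(e^{tX}e^{sY}e^{-tX}\right)\Big|_{s,t=0} && \text{since $\Phi$ is a homomorphism} \\
&= \frac{d}{dt}\frac{d}{ds}\,\Phi\left(e^{s\,e^{tX}Ye^{-tX}}\right)\Big|_{s,t=0} && \text{since $A e^{sY} A^{-1} = e^{sAYA^{-1}}$} \\
&= \frac{d}{dt}\,\phi\left(e^{tX}Ye^{-tX}\right)\Big|_{t=0} && \text{by definition of $\phi$} \\
&= \phi\left(\frac{d}{dt}\,e^{tX}Ye^{-tX}\Big|_{t=0}\right) && \text{by Exercise 4.37 below} \\
&= \phi([X, Y]) && \text{by (4.64)}
\end{aligned}
\]

and so φ is a Lie algebra homomorphism, and the proof is complete. □ 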


Exercise 4.37. Let φ be a linear map from a finite-dimensional vector space V to a 
finite-dimensional vector space W, and let γ(t) : R → V be a differentiable V-valued function 
of t (you can think of this as a path in V parametrized by t). Show from the definition of a 
derivative that 

\[
\frac{d}{dt}\phi(\gamma(t)) = \phi\left(\frac{d}{dt}\gamma(t)\right) \qquad \forall\, t. \tag{4.90}
\]


Now let’s make all this concrete by considering some examples. 
Example 4.41. SU(2) and SO(3) revisited 


Recall from Example 4.22 that we have a homomorphism ρ : SU(2) → SO(3) 
defined by the equation 

\[
[AXA^\dagger]_B = \rho(A)[X]_B \tag{4.91}
\]

where X ∈ su(2), A ∈ SU(2), and B = {S_x, S_y, S_z}. The induced Lie algebra 
homomorphism φ is given by 

\[
\phi(Y) = \frac{d}{dt}\rho(e^{tY})\Big|_{t=0}
\]

and you will show below that this gives 

\[
\phi(S_i) = L_i, \qquad i = 1, 2, 3. \tag{4.92}
\]

This means that φ is one-to-one and onto, and since the commutation relations (i.e., 
structure constants) are the same for the S_i and L_i, we can then conclude from our 
earlier discussion that φ is a Lie algebra isomorphism, and thus su(2) ≃ so(3). You 
may have already known or guessed this, but the important thing to keep in mind 
is that this Lie algebra isomorphism actually stems from the group homomorphism 
between SU(2) and SO(3). 
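
The relation (4.92) can be illustrated numerically by building ρ(A) from its definition (4.91) and differentiating ρ(e^{tS_i}) at t = 0 with a central difference. This is a sketch under stated assumptions: the so(3) basis is taken to be (L_i)_{jk} = −ε_{ijk}, the exponential is a truncated power series, and all helper names are ours.

```python
def matmul(a, b):
    return [[sum(a[i][k] * b[k][j] for k in range(len(b))) for j in range(len(b[0]))]
            for i in range(len(a))]

def scale(c, a):
    return [[c * x for x in row] for row in a]

def expm(a, terms=30):
    """Matrix exponential by direct summation of the (truncated) power series."""
    n = len(a)
    total = [[1.0 if i == j else 0.0 for j in range(n)] for i in range(n)]
    term = total
    for k in range(1, terms):
        term = scale(1.0 / k, matmul(term, a))
        total = [[x + y for x, y in zip(r, s)] for r, s in zip(total, term)]
    return total

def dagger(a):
    """Conjugate transpose."""
    return [[a[j][i].conjugate() for j in range(len(a))] for i in range(len(a))]

def eps(i, j, k):
    return (i - j) * (j - k) * (k - i) // 2

S = [
    [[0, -0.5j], [-0.5j, 0]],
    [[0, -0.5], [0.5, 0]],
    [[-0.5j, 0], [0, 0.5j]],
]

def components(x):
    """Components (x, y, z) of X = x S_1 + y S_2 + z S_3 in su(2)."""
    return [-2 * x[0][1].imag, -2 * x[0][1].real, -2 * x[0][0].imag]

def rho(a):
    """Matrix of X -> A X A^dagger in the basis B = {S_1, S_2, S_3}, as in (4.91)."""
    cols = [components(matmul(matmul(a, s), dagger(a))) for s in S]
    return [[cols[j][k] for j in range(3)] for k in range(3)]

def phi(i, h=1e-4):
    """Central-difference estimate of d/dt rho(exp(t S_i)) at t = 0."""
    plus, minus = rho(expm(scale(h, S[i]))), rho(expm(scale(-h, S[i])))
    return [[(p - m) / (2 * h) for p, m in zip(rp, rm)] for rp, rm in zip(plus, minus)]

# Assumed convention for the so(3) basis: (L_i)_{jk} = -eps_{ijk}.
L = [[[-eps(i, j, k) for k in range(3)] for j in range(3)] for i in range(3)]

induced_ok = all(
    abs(phi(i)[j][k] - L[i][j][k]) < 1e-6
    for i in range(3) for j in range(3) for k in range(3)
)
print(induced_ok)
```

Note that the derivative is taken of the whole expression ρ(e^{tS_i}), never of the argument alone, in keeping with the warning of Box 4.7.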


Exercise 4.38. Calculate d/dt ρ(e^{tS_i})|_{t=0} and verify (4.92). Hint: First, heed the warning of 
Box 4.7 below. Then, use the definition of ρ given in (4.91), let X be arbitrary, and plug 
in A = e^{tS_i}. You'll have to do the calculation separately for i = 1, 2, 3. Also, it may save 
time to compute d/dt (e^{tS_i} X e^{−tS_i}) symbolically (using the product rule) before plugging in 
coordinate expressions for the various matrices. 


Box 4.7 A Word About Calculating Induced Lie Algebra Homomorphisms 

A word of warning is required here. In calculating induced Lie algebra 
homomorphisms via the definition (4.87), one may be tempted to move the 
derivative d/dt "through" the group homomorphism Φ, and thus calculate φ(X) 
as Φ(d/dt e^{tX}). This may even seem justified by (4.90). Note, however, that 
the expression Φ(d/dt e^{tX}) is nonsensical, since the argument (when evaluated 
at t = 0) lives in the Lie algebra g, whereas the domain of Φ is the Lie 
group G. This suggests that (4.90) may not generally apply to Lie group 
homomorphisms; indeed, a key assumption in its derivation was that the map 
in question be a linear map between vector spaces, rather than a more general 
Lie group homomorphism. 


Example 4.42. SL(2, C) and SO(3, 1)_o revisited 


Just as in the last example, we'll now examine the Lie algebra isomorphism between 
sl(2, C)_R and so(3, 1) that arises from the homomorphism ρ : SL(2, C) → 
SO(3, 1)_o. Recall that ρ was defined by 

\[
[AXA^\dagger]_B = \rho(A)[X]_B
\]

where A ∈ SL(2, C), X ∈ H_2(C), and B = {σ_x, σ_y, σ_z, I}. The induced Lie 
algebra homomorphism is given again by 

\[
\phi(Y) = \frac{d}{dt}\rho(e^{tY})\Big|_{t=0}
\]

which, as you will again show, yields 

\[
\begin{aligned}
\phi(S_i) &= \tilde{L}_i \\
\phi(\tilde{K}_i) &= K_i.
\end{aligned} \tag{4.93}
\]

Thus φ is one-to-one, onto, and preserves the bracket (since the L̃_i and K_i have the 
same structure constants as the S_i and K̃_i); thus, sl(2, C)_R ≃ so(3, 1). □ 


Exercise 4.39. Verify (4.93). 


What is the moral of the story from the previous two examples? How should one 
think about these groups and their relationships? Well, the homomorphism ρ allows 
us to interpret any A ∈ SU(2) as a rotation and any A ∈ SL(2, C) as a restricted 
Lorentz transformation. As we mentioned before, though, ρ is two-to-one, and in 
fact A and −A in SU(2) correspond to the same rotation in SO(3), and likewise 
for SL(2, C). However, A and −A are not "close" to each other; for instance, if we 
consider an infinitesimal transformation A = I + εX, we have −A = −I − εX, 
which is not close to the identity (though it is close to −I). Thus, the fact that ρ is 
not one-to-one cannot be discerned by examining the neighborhood around a given 
matrix; one has to look at the global structure of the group for that. So one might say 
that locally, SU(2) and SO(3) are identical, but globally they differ. In particular, 
they are identical when one looks at elements near the identity, which is why their 
Lie algebras are isomorphic. The same comments hold for SL(2, C) and SO(3, 1)_o. 

One important fact to take away from this is that the correspondence between 
matrix Lie groups and Lie algebras is not one-to-one; two different matrix Lie 
groups might have isomorphic Lie algebras. Thus, if we start with a Lie algebra, 
there is no way to associate to it a unique matrix Lie group. This fact will have 
important implications in the next chapter. 


Example 4.43. The Ad and ad homomorphisms 


You may have found it curious that su(2) was involved in the group homomorphism 
between SU(2) and SO(3). This is no accident, and Example 4.41 is actually an 
instance of a much more general construction which we now describe. Consider a 
matrix Lie group G and its Lie algebra g. We know that for any A € G and X € g, 
AXA! is also in g, so we can actually define a linear operator Ad4 on g by 


Ad4(X) = AXA, Xeg. 


We can think of Ad; as the linear operator which takes a matrix X and applies the 
similarity transformation corresponding to A, as if A was implementing a change of 
basis. This actually allows us to define a group homomorphism 


Ad: G > GL(g) 
At Ady, 
where you should quickly verify that Ad4Adg = Adyg. Since Ad is a homo- 
morphism between the two matrix Lie groups G and GL(g), we can consider the 


induced Lie algebra homomorphism ¢ : g —> gl(g). What does ¢ look like? Well, if 
X €g, then ġ (X) € gl(g) is the linear operator given by 


d 
O(X) = Fy Adex 


t=0 


To figure out what this is, exactly, we evaluate the right-hand side on Y € g: 


$X) 


d 
= Adex (¥)) 


t=0 


£ (e'¥Ye™*) 


t=0 


[X,Y] 
adx (Y) 
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so @ is nothing but the ad homomorphism of Example 4.40! Thus ad is the 
“infinitesimal” version of Ad, and the commutator is the infinitesimal version 
of the similarity transformation. 

Note also that since Ad is a homomorphism, Ad,+x is a one-parameter subgroup 
in GL(g), and its derivative at t = 0 is ady. From (4.88) we then have 


Ad,ix = ef °x (4.94) 


as an equality in GL(g). In other words,

\[ e^{tX}\, Y\, e^{-tX} = Y + t[X,Y] + \frac{t^2}{2!}[X,[X,Y]] + \frac{t^3}{3!}[X,[X,[X,Y]]] + \cdots. \tag{4.95} \]

It is a nice exercise to expand the left-hand side of this equation as a power series
and verify the equality; this is Problem 4-13.
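As a numerical sanity check on the expansion of e^{tX} Y e^{-tX} in nested commutators, the following sketch compares both sides for one pair of 2 × 2 matrices. The helper names (`expm`, `ad_series`), the truncation at 40 terms, and the sample matrices are illustrative assumptions and do not come from the text; the matrix exponential is approximated by its truncated power series.

```python
# Compare e^{tX} Y e^{-tX} with sum_k (t^k / k!) ad_X^k (Y).
# Helper names, sample matrices and the 40-term cutoff are assumptions.

def matmul(A, B):
    n = len(A)
    return [[sum(A[i][k] * B[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def lincomb(A, B, s):
    """Return A + s*B entrywise."""
    return [[a + s * b for a, b in zip(ra, rb)] for ra, rb in zip(A, B)]

def scale(A, s):
    return [[s * a for a in row] for row in A]

def identity(n):
    return [[1.0 if i == j else 0.0 for j in range(n)] for i in range(n)]

def expm(A, terms=40):
    """Truncated power series for the matrix exponential."""
    result, term = identity(len(A)), identity(len(A))
    for k in range(1, terms):
        term = scale(matmul(term, A), 1.0 / k)   # A^k / k!
        result = lincomb(result, term, 1.0)
    return result

def comm(X, Y):
    return lincomb(matmul(X, Y), matmul(Y, X), -1.0)

def conjugated(X, Y, t):
    """Left-hand side: e^{tX} Y e^{-tX}."""
    return matmul(matmul(expm(scale(X, t)), Y), expm(scale(X, -t)))

def ad_series(X, Y, t, terms=40):
    """Right-hand side: Y + t[X,Y] + (t^2/2!)[X,[X,Y]] + ..."""
    result, term = Y, Y
    for k in range(1, terms):
        term = scale(comm(X, term), t / k)       # (t^k / k!) ad_X^k (Y)
        result = lincomb(result, term, 1.0)
    return result

X = [[0.3, 1.0], [-0.5, 0.2]]
Y = [[1.0, 2.0], [3.0, 4.0]]
t = 0.7
lhs = conjugated(X, Y, t)
rhs = ad_series(X, Y, t)
max_diff = max(abs(l - r) for rl, rr in zip(lhs, rhs) for l, r in zip(rl, rr))
print(max_diff)
```

A single agreeing example is of course no substitute for the inductive proof requested in Problem 4-13; it only guards against sign and factorial slips.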


Exercise 4.40. Let X, H ∈ M_n(C). Use (4.94) to show that

\[ [X, H] = 0 \iff e^{tX} H e^{-tX} = H \quad \forall\, t \in \mathbb{R}. \]


If we think of H as a quantum-mechanical Hamiltonian, this shows how the invariance 
properties of the Hamiltonian (like invariance under rotations R) can be formulated in terms 
of commutators with the corresponding generators. 


To make the connection between all this and Example 4.41, we note that usually
Ad_A will preserve a metric on g (known as the Killing Form K; see Problem 4-12),
and thus

\[ \mathrm{Ad} : G \to \mathrm{Isom}(\mathfrak{g}). \]

In the case of G = SU(2) above, g = su(2) is three-dimensional and K is positive-
definite, so⁴¹

\[ \mathrm{Isom}(\mathfrak{su}(2)) \simeq O(3), \]


and thus Ad : SU(2) → O(3). You can check that this map is identical to
the homomorphism ρ described in Example 4.22, and so we actually have Ad :
SU(2) → SO(3)! Thus the homomorphism between SU(2) and SO(3) is nothing
but the Adjoint map of SU(2), where Ad(g), g ∈ SU(2), is orthogonal with
respect to the Killing form on su(2). We also have the corresponding Lie algebra
homomorphism ad : su(2) → so(3), and we know that this must be equal to φ from
Example 4.41; thus, [ad_{S_i}]_B = φ(S_i) = L_i.
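The statement that Ad maps SU(2) into SO(3) can be probed numerically. The sketch below assumes that the basis B consists of the anti-Hermitian matrices S_i = -(i/2)σ_i (σ_i the Pauli matrices), builds an SU(2) element as e^{θ n·S}, and computes the 3 × 3 matrix of X ↦ AXA† in that basis. The function names (`su2`, `Ad`, `det3`) and the sample axis and angle are illustrative choices, not notation from the text.

```python
import math

# Basis B = {S_1, S_2, S_3} of su(2); assumed to be S_i = -(i/2) * Pauli_i.
S = [
    [[0j, -0.5j], [-0.5j, 0j]],
    [[0j, -0.5 + 0j], [0.5 + 0j, 0j]],
    [[-0.5j, 0j], [0j, 0.5j]],
]

def matmul(A, B):
    n = len(A)
    return [[sum(A[i][k] * B[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def dagger(A):
    return [[A[j][i].conjugate() for j in range(2)] for i in range(2)]

def su2(theta, n):
    """e^{theta n.S} = cos(theta/2) I + 2 sin(theta/2) n.S for a unit vector n."""
    c, s = math.cos(theta / 2), math.sin(theta / 2)
    alpha = complex(c, -n[2] * s)
    beta = complex(-n[1] * s, -n[0] * s)
    return [[alpha, beta], [-beta.conjugate(), alpha.conjugate()]]

def Ad(A):
    """Matrix of X -> A X A^dagger in the basis B.

    Components are read off with x^i = -2 Tr(S_i X), which follows from
    Tr(S_i S_j) = -delta_ij / 2.
    """
    M = [[0.0] * 3 for _ in range(3)]
    for j in range(3):
        image = matmul(matmul(A, S[j]), dagger(A))
        for i in range(3):
            prod = matmul(S[i], image)
            M[i][j] = (-2 * (prod[0][0] + prod[1][1])).real
    return M

def det3(M):
    (a, b, c), (d, e, f), (g, h, i) = M
    return a * (e * i - f * h) - b * (d * i - f * g) + c * (d * h - e * g)

def max_abs_diff(A, B):
    return max(abs(x - y) for ra, rb in zip(A, B) for x, y in zip(ra, rb))

theta = 1.1
A = su2(theta, (1 / 3, 2 / 3, 2 / 3))
R = Ad(A)
RRt = [[sum(R[i][k] * R[j][k] for k in range(3)) for j in range(3)]
       for i in range(3)]
I3 = [[float(i == j) for j in range(3)] for i in range(3)]
minus_A = [[-x for x in row] for row in A]
Rz = Ad(su2(theta, (0.0, 0.0, 1.0)))
```

Under these assumptions `R` should come out orthogonal with unit determinant, `Ad(minus_A)` should coincide with `R` (reflecting the kernel ±I), and `Rz` should be the ordinary rotation by θ about the z-axis.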


⁴¹ Of course, this identification depends on a choice of basis, which was made when we chose to
work with B = {S_1, S_2, S_3}.

Exercise 4.41. Using [S_i, S_j] = Σ_{k=1}^{3} ε_{ijk} S_k, compute the matrix representation of ad_{S_i}
in the basis B and verify explicitly that [ad_{S_i}]_B = L_i.
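For readers who want to cross-check their hand computation, here is a small sketch that builds the matrices of ad_{S_i} directly from the structure constants ε_{ijk} and compares them with the so(3) generators. It assumes the L_i are the standard generators L_x, L_y, L_z (entries written out below); the names `eps`, `ad` and `L` are illustrative.

```python
def eps(i, j, k):
    """Levi-Civita symbol for indices in {0, 1, 2}."""
    return (i - j) * (j - k) * (k - i) // 2

# ad_{S_i}(S_j) = sum_k eps_{ijk} S_k, so column j of [ad_{S_i}]_B holds the
# coefficients eps_{ijk}; ad[i][k][j] is the (row k, column j) entry.
ad = [[[eps(i, j, k) for j in range(3)] for k in range(3)] for i in range(3)]

# Generators L_x, L_y, L_z of so(3); assumed to be the standard ones.
L = [
    [[0, 0, 0], [0, 0, -1], [0, 1, 0]],
    [[0, 0, 1], [0, 0, 0], [-1, 0, 0]],
    [[0, -1, 0], [1, 0, 0], [0, 0, 0]],
]

for i in range(3):
    print(ad[i] == L[i])
```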


Example 4.44. The relationship between determinant and trace 


We saw in Exercise 4.16 that det : GL(n,C) → C* is a group homomorphism,
and in fact it is a continuous homomorphism between matrix Lie groups (recall
that C* ≃ GL(1,C)). Proposition 4.3 then tells us that det induces a Lie algebra
homomorphism φ : gl(n,C) → C. What is φ? In Problem 4-14 you will prove
(by direct calculation) the remarkable fact that φ is nothing but the trace functional.
We can then apply (4.88) with t = 1 to immediately obtain

\[ \det e^{X} = e^{\mathrm{Tr}\,X} \qquad \forall\, X \in \mathfrak{gl}(n,\mathbb{C}), \]


a result which we proved only for diagonalizable matrices back in Proposition 4.6. 
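Since the identity det e^X = e^{Tr X} now holds without any diagonalizability assumption, it is instructive to test it on a matrix that is not diagonalizable. The sketch below does so for a 2 × 2 Jordan block and for one arbitrary real matrix; the series-based `expm` helper, the 60-term cutoff and the sample matrices are assumptions made for illustration.

```python
import math

def matmul(A, B):
    n = len(A)
    return [[sum(A[i][k] * B[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def expm(A, terms=60):
    """Truncated power series for the matrix exponential."""
    n = len(A)
    result = [[float(i == j) for j in range(n)] for i in range(n)]
    term = [row[:] for row in result]
    for k in range(1, terms):
        term = [[x / k for x in row] for row in matmul(term, A)]   # A^k / k!
        result = [[r + t for r, t in zip(rr, tr)] for rr, tr in zip(result, term)]
    return result

def det2(M):
    return M[0][0] * M[1][1] - M[0][1] * M[1][0]

def trace(M):
    return sum(M[i][i] for i in range(len(M)))

jordan = [[1.0, 1.0], [0.0, 1.0]]      # not diagonalizable
generic = [[0.2, -1.3], [0.7, 0.5]]    # arbitrary sample matrix

checks = [(det2(expm(X)), math.exp(trace(X))) for X in (jordan, generic)]
for lhs, rhs in checks:
    print(lhs, rhs)
```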


Chapter 4 Problems 


Note: Problems marked with a “*” tend to be longer, and/or more difficult, and/or
more geared towards the completion of proofs and the tying up of loose ends. 
Though these problems are still worthwhile, they can be skipped on a first reading. 


4-1. (*) In this problem we show that SO(n) can be characterized as the set of 
all linear operators which take orthonormal (ON) bases into orthonormal 
bases and can be obtained continuously from the identity. 


(a) The easy part. Show that if a linear operator R takes ON bases into
ON bases and is continuously obtainable from the identity, then R ∈
SO(n). It’s immediate that R ∈ O(n); the trick here is showing that
det R = 1.

(b) The converse. If R ∈ SO(n), then it’s immediate that R takes ON
bases into ON bases. The slightly nontrivial part is showing that R is
continuously obtainable from the identity. Prove this using induction,
as follows: First, show that the claim is trivially true for SO(1). Then
suppose that the claim is true for n − 1. Take R ∈ SO(n) and
show that it can be continuously connected (via orthogonal similarity
transformations) to a matrix of the form

\[ \begin{pmatrix} 1 & 0 \\ 0 & R' \end{pmatrix}, \qquad R' \in SO(n-1). \]

The claim then follows since by hypothesis R′ can be continuously
connected to the identity. (Hint: You’ll need the 2-D rotation which
takes e_1 into Re_1.)


4-2. In this problem we prove Euler’s theorem that any R ∈ SO(3) has an
eigenvector with eigenvalue 1. This means that all vectors v proportional to
this eigenvector are invariant under R, i.e. Rv = v, and so R fixes a line in
space, known as the axis of rotation.


(a) Show that λ being an eigenvalue of R is equivalent to det(R − λI) = 0.
Refer to Problem 3-5 if necessary.
(b) Prove Euler’s theorem by showing that

\[ \det(R - I) = 0. \]

Do this using the orthogonality condition and properties of the
determinant. You should not have to work in components.


4-3. Show that the matrix (4.24) is just the component form (in the standard
basis) of the linear operator

\[ R(\hat{n}, \theta) = L(\hat{n}) \otimes \hat{n} + \cos\theta\,\big(I - L(\hat{n}) \otimes \hat{n}\big) + \sin\theta\; \hat{n}\times \]

where (as you should recall) L(v)(w) = (v|w), and the last term eats
a vector v and spits out sin θ n̂ × v. Show that the first term is just the
projection onto the axis of rotation n̂, and that the second and third terms
just give a counterclockwise rotation by θ in the plane perpendicular to n̂.
Convince yourself that this is exactly what a rotation about n̂ should do.
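The operator in Problem 4-3 is easy to experiment with. The following sketch writes its three terms as 3 × 3 matrices (projection onto n̂, cos θ times the complementary projection, and sin θ times the cross-product matrix of n̂) and checks a few properties one expects of a rotation. The function name `rotation` and the sample axis are illustrative assumptions.

```python
import math

def rotation(n, theta):
    """R = n n^T + cos(theta) (I - n n^T) + sin(theta) [n x], n a unit vector."""
    c, s = math.cos(theta), math.sin(theta)
    # Matrix of v -> n x v.
    cross = [[0.0, -n[2], n[1]],
             [n[2], 0.0, -n[0]],
             [-n[1], n[0], 0.0]]
    return [[n[i] * n[j] + c * (float(i == j) - n[i] * n[j]) + s * cross[i][j]
             for j in range(3)] for i in range(3)]

def apply(M, v):
    return [sum(M[i][k] * v[k] for k in range(3)) for i in range(3)]

n = (2 / 7, 3 / 7, 6 / 7)            # unit vector: 4 + 9 + 36 = 49
theta = 0.9
R = rotation(n, theta)

fixed = apply(R, n)                   # should reproduce n (axis is invariant)
RtR = [[sum(R[k][i] * R[k][j] for k in range(3)) for j in range(3)]
       for i in range(3)]             # should be the identity
Rz = rotation((0.0, 0.0, 1.0), theta) # should be the familiar z-rotation
```

The check `R n̂ = n̂` is also a concrete instance of Euler’s theorem from Problem 4-2.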


4-4. (*) Show that SO(3,1)_o is a subgroup of O(3,1). Remember that
SO(3,1)_o is defined by 3 conditions: |A| = 1, A_44 ≥ 1, and (4.18). Proceed
as follows:


(a) Show that I ∈ SO(3,1)_o.
(b) Show that if A ∈ SO(3,1)_o, then A^{-1} ∈ SO(3,1)_o. Do this as follows:


(i) Verify that |A^{-1}| = 1.
(ii) Show that A^{-1} satisfies (4.18). Use this to deduce that A^T does
also.
(iii) Write out the 44 component of (4.18) for both A and A^{-1}. You
should get equations of the form

\[ a_0^2 = 1 + \mathbf{a}^2, \qquad b_0^2 = 1 + \mathbf{b}^2, \tag{4.96} \]

where a_0 ≡ A_44 and b_0 ≡ (A^{-1})_44. Clearly this implies b_0 ≤ −1 or b_0 ≥ 1.
Now, write out the 44 component of the equation AA^{-1} = I. You
should find

\[ a_0 b_0 = 1 - \mathbf{a}\cdot\mathbf{b}. \]

If we let a ≡ |a|, b ≡ |b|, then the last equation implies

\[ 1 - ab \le a_0 b_0 \le 1 + ab. \tag{4.97} \]

Assume b_0 ≤ −1 and use (4.96) to derive a contradiction to (4.97),
hence showing that b_0 = (A^{-1})_44 ≥ 1, and that A^{-1} ∈ SO(3,1)_o.


(c) Show that if A, B ∈ SO(3,1)_o, then AB ∈ SO(3,1)_o. You may have
to do some inequality manipulating to show that (AB)_44 > 0.


4-5. (*) In this problem we prove that any A ∈ SO(3,1)_o can be written as a
product of a rotation and a boost.


(a) If A is not a pure rotation, then there is some relative velocity β
between the standard reference frame and the new one described by A.
Use the method of Exercise 4.13 to find β in terms of the components
of A.

(b) Let L be the pure boost of the form (4.31) corresponding to the β you
found above. Show that L^{-1}A is a rotation, by calculating that

\[ (L^{-1}A)_{44} = 1, \qquad (L^{-1}A)_{i4} = (L^{-1}A)_{4i} = 0, \quad i = 1, 2, 3. \]

Conclude that A = L(L^{-1}A) is the desired decomposition.


4-6. (*) In this problem we show that any A ∈ SL(2,C) can be decomposed
as A = LU where U ∈ SU(2) and L is of the form (4.36). Unfortunately,
it’s a little too much work to prove this from scratch, so we’ll start with
the polar decomposition theorem, which states that any A ∈ SL(2,C) can
be decomposed as A = HU where U ∈ SU(2) and H is Hermitian and
positive, which means that (v|Hv) > 0 for all nonzero v ∈ C² (here (·|·)
is the standard Hermitian inner product on C²). The polar decomposition
theorem can be thought of as a higher-dimensional analog of the polar form
of a complex number, z = re^{iθ}. You’ll show below that the set of positive,
Hermitian H ∈ SL(2,C) is exactly the set of matrices of the form (4.36).
This, combined with the polar decomposition theorem, yields the desired
result. For more on the polar decomposition theorem itself, including a
proof, see Hall [11].


(a) Show that an arbitrary Hermitian H ∈ SL(2,C) can be written as

\[ H = \begin{pmatrix} a + b & z \\ \bar{z} & a - b \end{pmatrix} \]

where

\[ a, b \in \mathbb{R}, \quad z \in \mathbb{C}, \quad a^2 - b^2 - |z|^2 = 1. \tag{4.98} \]

(b) Show that any numbers a, b, z satisfying (4.98) can be written in the
form

\[ a = \pm\cosh u, \qquad b = v_z \sinh u, \qquad z = (v_x - i v_y)\sinh u \]

for some u ∈ R and unit vector v ∈ R³. Then show that positivity
requires that a = +cosh u. This puts H in the form (4.36).

(c) It remains to be shown that an H of the form (4.36) is actually positive.
To do this, we employ a theorem (see Hoffman and Kunze [13]) which
states that a matrix B is positive if and only if it is Hermitian and all
its principal minors are positive, where its principal minors Δ_k(B),
k ≤ n, are the k × k partial determinants defined by

\[ \Delta_k(B) \equiv \det \begin{pmatrix} B_{11} & \cdots & B_{1k} \\ \vdots & & \vdots \\ B_{k1} & \cdots & B_{kk} \end{pmatrix}. \]

Show that both the principal minors Δ_1(H) and Δ_2(H) are positive,
so that H is positive.


4-7. (*) In this problem we find an explicit formula for the map ρ : SU(2) →
O(3) of Example 4.22 and use it to prove that ρ maps SU(2) onto SO(3)
and has kernel ±I.


(a) Take an arbitrary SU(2) matrix A of the form (4.25) and calculate
AXA† for an arbitrary X ∈ su(2) of the form (4.40).

(b) Decomposing α and β into their real and imaginary parts, use (a) to
compute the column vector [AXA†].

(c) Use (b) to compute the 3 × 3 matrix ρ(A). Recall that ρ(A) is defined
by the equation ρ(A)[X] = [AXA†].

(d) Parametrize α and β as

\[ \alpha = e^{i(\psi+\phi)/2}\cos\frac{\theta}{2}, \qquad \beta = i\,e^{i(\psi-\phi)/2}\sin\frac{\theta}{2} \]

as in (4.26). Substitute this into your expression for ρ(A) and show that
this is the transpose (or inverse, by the orthogonality relation) of (4.23).
(e) Parametrize α and β as in (4.74) by

\[ \alpha = \cos(\theta/2) - i n_z \sin(\theta/2), \qquad \beta = (-i n_x - n_y)\sin(\theta/2) \]

and substitute this into your expression for ρ(A) to get (4.24).

(f) Conclude that det ρ(A) = 1 and that ρ maps SU(2) onto SO(3). Also
use your expression from (c) to show that the kernel of ρ is ±I. (It’s
obvious that ±I ∈ K. What takes a little calculation is showing that
±I is all of K.)


4-8. (*) In this problem we find an explicit formula for the map ρ : SL(2,C) →
O(3,1) of Example 4.23, and use it to prove that ρ maps SL(2,C) onto
SO(3,1)_o and has kernel ±I. Note that since any A ∈ SL(2,C) can be
decomposed into A = L̃R with R ∈ SU(2) and L̃ of the form (4.36), and
any B ∈ SO(3,1)_o can be decomposed as B = LR′ where L is of the
form (4.33) and R′ ∈ SO(3) ⊂ SO(3,1)_o, our task will be complete if we
can show that ρ(L̃) = L. We’ll do this as follows:


(a) To simplify computation, write the matrix (4.36) as

\[ \tilde{L} = \begin{pmatrix} a + b & z \\ \bar{z} & a - b \end{pmatrix}, \qquad a, b \in \mathbb{R},\ z \in \mathbb{C} \]

and calculate L̃XL̃† for X ∈ H_2(C) of the form (4.44).

(b) Decomposing z into its real and imaginary parts, compute the column
vector [L̃XL̃†].

(c) Use (b) to compute the 4 × 4 matrix ρ(L̃).

(d) Substitute back in the original expressions for a, b, and z

\[ a = \cosh\frac{u}{2}, \qquad b = \frac{u_z}{u}\sinh\frac{u}{2}, \qquad z = \frac{1}{u}(u_x - i u_y)\sinh\frac{u}{2} \]

into ρ(L̃) to obtain (4.33).


4-9. (*) In this problem we’ll prove the claim from Example 4.25 that,
for a given permutation σ ∈ S_n, the number of transpositions in any
decomposition of σ is either always odd or always even.


(a) Consider the polynomial

\[ p(x_1, \ldots, x_n) = \prod_{i<j} (x_i - x_j). \]

For example, for n = 3 and n = 4 this gives

\[
\begin{aligned}
p(x_1, x_2, x_3) &= (x_1 - x_2)(x_1 - x_3)(x_2 - x_3) \\
p(x_1, x_2, x_3, x_4) &= (x_1 - x_2)(x_1 - x_3)(x_1 - x_4)(x_2 - x_3)(x_2 - x_4)(x_3 - x_4).
\end{aligned}
\]

Define an action of σ ∈ S_n on p by

\[ (\sigma p)(x_1, \ldots, x_n) \equiv p(x_{\sigma(1)}, \ldots, x_{\sigma(n)}) = \prod_{i<j} (x_{\sigma(i)} - x_{\sigma(j)}). \]

Convince yourself that σp = ±p.

(b) Let τ ∈ S_n be a transposition. Prove (or at least convince yourself) that
τp = −p.

(c) Now assume that σ has a decomposition into an even number of
transpositions. Use p to prove that σ can then never have a decomposition
into an odd number of transpositions. Use the same logic to show that
if σ has a decomposition into an odd number of transpositions, then all
of its decompositions must have an odd number of transpositions.
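The role of the polynomial p in Problem 4-9 can be explored by evaluating it at one point with distinct coordinates and comparing σp with p there. The sketch below does this for S_4; it is only a spot check at a sample point (x_k = k), not a proof, and the names `p`, `sign`, `compose` and `transposition` are illustrative. Permutations are stored 0-based as tuples with `sigma[i]` the image of `i`.

```python
from itertools import permutations

def p(xs):
    """Product over i < j of (x_i - x_j)."""
    result = 1
    for i in range(len(xs)):
        for j in range(i + 1, len(xs)):
            result *= xs[i] - xs[j]
    return result

def sign(sigma):
    """Ratio (sigma p)/p evaluated at the sample point x_k = k + 1."""
    xs = list(range(1, len(sigma) + 1))
    permuted = [xs[sigma[i]] for i in range(len(sigma))]
    return p(permuted) // p(xs)      # exact: both products agree up to sign

def compose(s, t):
    """(s t)(i) = s(t(i))."""
    return tuple(s[t[i]] for i in range(len(t)))

def transposition(n, a, b):
    s = list(range(n))
    s[a], s[b] = s[b], s[a]
    return tuple(s)

S4 = list(permutations(range(4)))
transposition_signs = {sign(transposition(4, a, b))
                       for a in range(4) for b in range(a + 1, 4)}
multiplicative = all(sign(compose(s, t)) == sign(s) * sign(t)
                     for s in S4 for t in S4)
print(transposition_signs, multiplicative)
```

Because every transposition contributes a factor of −1 and the ratio is multiplicative, a product of k transpositions has ratio (−1)^k, which is the heart of part (c).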


4-10. (*) Consider the subset H ⊂ GL(n,C) consisting of those matrices
whose entries are real and rational. Show that H is in fact a subgroup, and
construct a sequence of matrices in H that converges to an invertible matrix
with irrational entries (there are many ways to do this!). This shows that H
is a subgroup of GL(n,C) which is not a matrix Lie group.


4-11. In this problem we’ll calculate the first few terms in the Baker-Campbell-
Hausdorff formula (4.66).

Let G be a Lie group and let X, Y ∈ g. Suppose that X and Y are small,
so that e^X e^Y is close to the identity and hence has a logarithm computable
by the power series for ln,

\[ \ln(X) = -\sum_{k=1}^{\infty} \frac{(I - X)^k}{k}. \]

By explicitly expanding out the relevant power series, show that up to third
order in X and Y,

\[ \ln(e^X e^Y) = X + Y + \frac{1}{2}[X, Y] + \frac{1}{12}[X, [X, Y]] - \frac{1}{12}[Y, [X, Y]] + \cdots. \]

Note that one would actually have to compute higher order terms to verify
that they can be written as commutators, and this gets very tedious. A more
sophisticated proof is needed to show that every term in the series is an
iterated commutator and hence an element of g. See Varadarajan [20] for
such a proof.
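One way to test the third-order formula without any truncation error is to pick X and Y strictly upper triangular and 4 × 4: any product of four such matrices vanishes, so the exponential, the logarithm, and the Baker-Campbell-Hausdorff series all terminate after third order and can be compared exactly with rational arithmetic. The sketch below does this; the choice of nilpotent test matrices and all helper names are assumptions made for illustration, not part of the problem.

```python
from fractions import Fraction

N = 4

def mat(rows):
    return [[Fraction(x) for x in row] for row in rows]

def matmul(A, B):
    return [[sum(A[i][k] * B[k][j] for k in range(N)) for j in range(N)]
            for i in range(N)]

def lincomb(*terms):
    """Sum of coefficient * matrix over (coefficient, matrix) pairs."""
    return [[sum(c * M[i][j] for c, M in terms) for j in range(N)]
            for i in range(N)]

def comm(A, B):
    return lincomb((1, matmul(A, B)), (-1, matmul(B, A)))

IDENTITY = mat([[int(i == j) for j in range(N)] for i in range(N)])

def exp_nilpotent(A):
    """e^A = I + A + A^2/2 + A^3/6, exact because A^4 = 0."""
    A2 = matmul(A, A)
    A3 = matmul(A2, A)
    return lincomb((1, IDENTITY), (1, A), (Fraction(1, 2), A2), (Fraction(1, 6), A3))

def log_unipotent(M):
    """ln M = D - D^2/2 + D^3/3 with D = M - I, exact because D^4 = 0."""
    D = lincomb((1, M), (-1, IDENTITY))
    D2 = matmul(D, D)
    D3 = matmul(D2, D)
    return lincomb((1, D), (Fraction(-1, 2), D2), (Fraction(1, 3), D3))

X = mat([[0, 1, 2, 0], [0, 0, 1, 3], [0, 0, 0, 1], [0, 0, 0, 0]])
Y = mat([[0, 2, 0, 1], [0, 0, -1, 0], [0, 0, 0, 3], [0, 0, 0, 0]])

lhs = log_unipotent(matmul(exp_nilpotent(X), exp_nilpotent(Y)))
XY = comm(X, Y)
rhs = lincomb((1, X), (1, Y), (Fraction(1, 2), XY),
              (Fraction(1, 12), comm(X, XY)),
              (Fraction(-1, 12), comm(Y, XY)))
print(lhs == rhs)
```

The test matrices are chosen so that the double commutators do not vanish, so the 1/12 coefficients are genuinely exercised.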
4-12. Let G be a matrix Lie group. Its Lie algebra g comes equipped with
a symmetric (2,0) tensor known as its Killing Form, denoted K and
defined by


\[ K(X, Y) = -\mathrm{Tr}(\mathrm{ad}_X\, \mathrm{ad}_Y). \]

(a) Show that K is Ad-invariant, in the sense that

\[ K(\mathrm{Ad}_A(X), \mathrm{Ad}_A(Y)) = K(X, Y) \qquad \forall\, X, Y \in \mathfrak{g},\ A \in G. \]

(b) You know from Exercise 4.41 that [ad_{S_i}] = L_i. Use this to compute the
components of K in the {S_i} basis and prove that

\[ [K] = 2I. \]

Thus, K is positive definite. This means that K is an inner product on
su(2), and so from part (a) we conclude that Ad_A ∈ Isom(su(2)) ≃
O(3).
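The component computation in part (b) can be cross-checked mechanically. Taking [ad_{S_i}] = L_i as given, and assuming the L_i are the standard so(3) generators written out below, the components K_ij = −Tr(L_i L_j) follow from a few integer matrix products. The helper names are illustrative.

```python
# Standard so(3) generators L_x, L_y, L_z (assumed form of the L_i).
L = [
    [[0, 0, 0], [0, 0, -1], [0, 1, 0]],
    [[0, 0, 1], [0, 0, 0], [-1, 0, 0]],
    [[0, -1, 0], [1, 0, 0], [0, 0, 0]],
]

def matmul(A, B):
    return [[sum(A[i][k] * B[k][j] for k in range(3)) for j in range(3)]
            for i in range(3)]

def trace(A):
    return sum(A[i][i] for i in range(3))

# K_ij = K(S_i, S_j) = -Tr(ad_{S_i} ad_{S_j}) = -Tr(L_i L_j)
K = [[-trace(matmul(L[i], L[j])) for j in range(3)] for i in range(3)]
print(K)
```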


4-13. Prove directly that

\[ \mathrm{Ad}_{e^{tX}} = e^{t\,\mathrm{ad}_X} \tag{4.99} \]

by induction, as follows: first verify that the terms first order in t on either
side are equal. Then, assume that the nth order terms are equal (where n
is an arbitrary integer), and use this to prove that the (n + 1)th order terms
are equal. Induction then shows that the terms of every order are equal, and
so (4.99) is proven.


4-14. In Example 4.44 we claimed that the trace functional Tr is the Lie algebra
homomorphism φ induced by the determinant function, when the latter is
considered as a homomorphism

\[ \det : GL(n, \mathbb{C}) \to \mathbb{C}^*, \qquad A \mapsto \det A. \]

This problem asks you to prove this claim.
Begin with the definition (4.87) of φ, which in this case says

\[ \phi(X) = \left.\frac{d}{dt}\det\big(e^{tX}\big)\right|_{t=0}. \]

To show that φ(X) = Tr X, expand the exponential above to first order in t,
plug into the determinant using the formula (3.72), and expand this to first
order in t using properties of the determinant. This should yield the desired
result.


Chapter 5 
Basic Representation Theory 


Now that we are familiar with groups and, in particular, the various transformation 
groups (i.e., matrix Lie groups) that arise in physics, we are ready to look at objects 
that “transform” in specific ways under the action of these groups. The notion of an 
object “transforming” in a specific way is made precise by the mathematical notion 
of a representation, which is essentially just a way of representing the elements of 
a group or Lie algebra as operators on a vector space; the objects which “transform” 
are then just elements of the vector space. 

Representations are important in both classical and quantum physics. In classical 
physics, they clarify what we mean by a particular object’s “transformation proper- 
ties.” In quantum mechanics representations actually provide the basic mathematical 
framework, since the Lie algebra of observables g acts on the Hilbert space at 
hand, making it a representation of g. Furthermore, representation theory clarifies 
the notion of “vector” and “tensor” operators, which are usually introduced in a 
somewhat ad-hoc way (much as we did in Sect. 3.7!). Finally, representation theory 
allows us to easily and generally prove many of the quantum-mechanical “selection 
rules” that are so handy in computation. 

As is now our custom, we begin this chapter with some heuristics which 
hopefully motivate the basic definitions of representation theory, as well as the 
questions it seeks to answer. 


5.1 Invitation: Symmetry Groups and Quantum Mechanics 


The basic notions of representation theory are very natural, perhaps even obvious. 
Nonetheless, they can be very helpful in organizing our thinking about the plethora 
of linear operators and corresponding vector spaces that arise in physics, particularly 
in quantum theory. In this section we’ll show how the notion of a representation
arises very naturally in quantum mechanics, and is in fact the essence
of the canonical quantization prescription. We’ll also consider the plurality of
representations of a given symmetry, how that plurality depends on the symmetry
under consideration, and in particular we’ll revisit various forms of the angular
momentum operators and see how representation theory might help us make sense
of them.

The definition of a representation arises very naturally if we consider symmetry 
transformations in quantum mechanics. As we saw in the last chapter, the basic 
symmetries of space and spacetime (translations, rotations, Lorentz transformations, 
parity, etc.) are mathematically embodied in groups acting on R³ or R⁴. For
a quantum-mechanical system with Hilbert space H, there should then be
some corresponding action of these groups on H. Furthermore, this action
should be via unitary operators (i.e., isometries; cf. Sect. 4.2). This is because
quantum-mechanical states are represented by unit vectors, and so any symmetry 
transformation which takes a state into another state must preserve norms. This 
implies (as can be shown via an argument similar to the one used in Exercise 4.21) 
that such a transformation must be unitary. 

Let G be our symmetry group. Then there should be a map 


\[ \Pi : G \to \mathrm{Isom}(\mathcal{H}), \]

where for any g ∈ G, Π(g) implements the transformation g on our quantum-
mechanical system. It is only natural to require that the implementation of g_1
followed by the implementation of g_2 should be the same as implementing g_2 g_1.
This means that we must have

\[ \Pi(g_2)\,\Pi(g_1) = \Pi(g_2 g_1). \]


This, of course, just says that Π must be a group homomorphism! Note that Π is
of a restricted class of homomorphisms, in that it maps G into a group of linear
operators. This motivates the definition of a representation of a group G as a group
homomorphism

\[ \Pi : G \to GL(V) \quad \text{for some vector space } V. \]

If, as is often the case, V is an inner product space and Π : G → Isom(V), then the
representation Π is said to be unitary.
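To make the homomorphism requirement concrete, here is a minimal sketch using a finite group in place of a Lie group: the permutation group S_3 acting on R³ by permuting the standard basis vectors. This example is not taken from the text, and the names `Pi`, `compose` and `S3` are illustrative. The code checks that Π(g_2)Π(g_1) = Π(g_2 g_1) for every pair of elements, and that each Π(g) preserves the standard inner product, so the representation is unitary in the sense just defined.

```python
from itertools import permutations

def compose(s, t):
    """Group product: (s t)(i) = s(t(i))."""
    return tuple(s[t[i]] for i in range(len(t)))

def Pi(s):
    """Permutation matrix with Pi(s) e_i = e_{s(i)}."""
    n = len(s)
    return [[1 if r == s[c] else 0 for c in range(n)] for r in range(n)]

def matmul(A, B):
    n = len(A)
    return [[sum(A[i][k] * B[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def transpose(A):
    return [list(col) for col in zip(*A)]

S3 = list(permutations(range(3)))
identity = Pi((0, 1, 2))

is_homomorphism = all(matmul(Pi(s), Pi(t)) == Pi(compose(s, t))
                      for s in S3 for t in S3)
is_unitary = all(matmul(transpose(Pi(s)), Pi(s)) == identity for s in S3)
print(is_homomorphism, is_unitary)
```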

If G is a matrix Lie group, then Proposition 4.3 tells us that any¹ representation
Π : G → GL(V) induces a Lie algebra homomorphism π : g → gl(V). Such
homomorphisms are similarly known as Lie algebra representations. If V is an
inner product space and Π is unitary, then by the argument leading to (4.55) π
will map g into the space isom(V) of anti-Hermitian operators (cf. Example 4.36).
In this case π is also said to be unitary. With this terminology we can say that
any quantum-mechanical Hilbert space H should carry a unitary representation
π : g → isom(H) of any physically relevant Lie algebra g.


¹ We ignore here the (very mild) requirement of continuity of Π. For more on this see Hall [11],
Sect. 1.6.

This raises some obvious and important questions. For a given g, do any such
representations exist? Are they unique? If not, how many are there? These are some 
of the questions representation theory seeks to answer. 

Unsurprisingly, the answer depends very much on g. Consider first the Lie 
algebra C(P) of observables for a classical system with phase space P, discussed 
in Example 4.37. We mentioned there that the canonical prescription for quantizing 
a classical system is to interpret the elements of C(P) as anti-Hermitian operators 
on some Hilbert space H, where the commutation relation between operators is just 
given by the Poisson bracket of the corresponding functions in C(P). This, however,
is just the definition of a unitary Lie algebra representation! We can thus reformulate
the standard quantization prescription as:

The canonical quantization of a classical system with phase space P consists
of finding a unitary representation of its Lie algebra of observables C(P).


The canonical quantization prescription is thus just a statement about 
representations. 

The question remains, though, about the existence and uniqueness of
representations of C(P). To answer this, let’s consider the simplest case where our
physical system has just one dimension with coordinate q, along with conjugate
momentum p, so that P = R². Let’s further restrict ourselves to the Heisenberg
algebra H ⊂ C(R²), introduced in Example 4.38. Any representation of C(R²)
will induce a representation of H, so we content ourselves with asking whether
H has any unitary representations.² This question is answered by the celebrated
Stone-von Neumann theorem, which says that (up to a change of basis and under
certain technical assumptions), there is only one unitary representation π of H.
This, then, must be given by the familiar q̂ and p̂ operators acting on V = L²(R),
as in (4.83). We can thus rest easy because the one representation of H that we
know is essentially the only game in town!

Of course, the Heisenberg algebra H is not the only Lie algebra of physical
interest. While q̂ and p̂ in H generate translations in momentum and position space
(as discussed in Example 4.38), there are other symmetries, such as rotations, that
should also be represented on H. To a certain degree this happens automatically:
you may recall from Example 4.37 that in three spatial dimensions so(3) ⊂ C(P),
so that the usual representation of C(P) on L²(R³) induces an so(3) representation
on the same space. This corresponds to orbital angular momentum, and the form
of the representation is given by (5.3) below. But, we also know that particles
carry an additional “spin” degree of freedom, unrelated to their spatial degrees
of freedom, which form additional representations of so(3) that must be “tensor-
producted” with L²(R³). The “theory of angular momentum,” which is really just
so(3) representation theory in disguise, says that the only relevant possibilities are
the spin s representations, described in Example 3.20.³ We have, however, already
met several other incarnations of the so(3) generators L_i. Are these related to the
spin s representations, and if so, how?


² Extending such representations to all of C(P) is possible in some senses, leading to deformation
quantization and geometric quantization, but such topics are far outside the scope of this text.
A standard reference for geometric quantization is Woodhouse [22], and an introduction to
deformation quantization can be found in Zachos [23].

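Let us revisit these various versions of the so(3) generators L_i, viewing them
properly as representations π : so(3) → gl(V). First, consider the so(3) ≃ su(2)
representation given by the su(2) matrices S_i from Example 4.32. These matrices
obey the so(3) commutation relations, so we can take V = C² and define

\[
\pi(L_x) \equiv S_x = \frac{1}{2}\begin{pmatrix} 0 & -i \\ -i & 0 \end{pmatrix}, \quad
\pi(L_y) \equiv S_y = \frac{1}{2}\begin{pmatrix} 0 & -1 \\ 1 & 0 \end{pmatrix}, \quad
\pi(L_z) \equiv S_z = \frac{1}{2}\begin{pmatrix} -i & 0 \\ 0 & i \end{pmatrix}. \tag{5.1}
\]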


This is, of course, the familiar s = 1/2 representation. 
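A slightly less straightforward example is given by taking V = R³ and π :
so(3) → gl(R³) to be the identity map, so that

\[
\pi(L_x) = L_x = \begin{pmatrix} 0 & 0 & 0 \\ 0 & 0 & -1 \\ 0 & 1 & 0 \end{pmatrix}, \quad
\pi(L_y) = L_y = \begin{pmatrix} 0 & 0 & 1 \\ 0 & 0 & 0 \\ -1 & 0 & 0 \end{pmatrix}, \quad
\pi(L_z) = L_z = \begin{pmatrix} 0 & -1 & 0 \\ 1 & 0 & 0 \\ 0 & 0 & 0 \end{pmatrix}. \tag{5.2}
\]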


These matrices act on three-dimensional vectors, which are sometimes said to be
“spin-one,” suggesting that this representation is related to the s = 1 representation
on C³. That representation is complex, however, and usually features a diagonalized
L_z. Nonetheless, it is straightforward to extend real representations to complex ones
(as we’ll do in Sect. 5.10), at which point we can make a (complex) change of
basis to diagonalize L_z. This turns the matrices in (5.2) into the familiar s = 1
representation, reproduced in (5.4) below. This change of basis should be familiar,
actually; you already used it in Exercise 3.9 to diagonalize essentially the same
version of L_z as above! This suggests that representations which differ by only a
change of basis should be considered “equivalent”; we will take this up in Sect. 5.6.
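Both the spin-1/2 matrices and the 3 × 3 rotation generators are claimed to obey the so(3) commutation relations [π(L_i), π(L_j)] = Σ_k ε_{ijk} π(L_k). The sketch below checks this entry by entry. It assumes the S_i are the matrices −(i/2)σ_i and the L_i the standard rotation generators; the function name `is_so3_rep` and the tolerance are illustrative choices.

```python
def eps(i, j, k):
    """Levi-Civita symbol for indices in {0, 1, 2}."""
    return (i - j) * (j - k) * (k - i) // 2

def matmul(A, B):
    n = len(A)
    return [[sum(A[i][k] * B[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def is_so3_rep(gens, tol=1e-12):
    """True if [gens[i], gens[j]] = sum_k eps_ijk gens[k] for all i, j."""
    n = len(gens[0])
    for i in range(3):
        for j in range(3):
            AB, BA = matmul(gens[i], gens[j]), matmul(gens[j], gens[i])
            for r in range(n):
                for c in range(n):
                    expected = sum(eps(i, j, k) * gens[k][r][c] for k in range(3))
                    if abs(AB[r][c] - BA[r][c] - expected) > tol:
                        return False
    return True

# Spin-1/2 matrices S_x, S_y, S_z (assumed form: -(i/2) times the Pauli matrices).
spin_half = [
    [[0, -0.5j], [-0.5j, 0]],
    [[0, -0.5], [0.5, 0]],
    [[-0.5j, 0], [0, 0.5j]],
]

# Vector representation: the rotation generators themselves.
vector = [
    [[0, 0, 0], [0, 0, -1], [0, 1, 0]],
    [[0, 0, 1], [0, 0, 0], [-1, 0, 0]],
    [[0, -1, 0], [1, 0, 0], [0, 0, 0]],
]

# Flipping the sign of one generator should break the relations.
broken = [[[-x for x in row] for row in vector[0]], vector[1], vector[2]]

print(is_so3_rep(spin_half), is_so3_rep(vector), is_so3_rep(broken))
```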

As yet another example, recall from Sect. 2.4 that on the vector space of
polynomials on R³, we can represent the L_i as

\[
\begin{aligned}
\pi(L_x) &= z\frac{\partial}{\partial y} - y\frac{\partial}{\partial z} \\
\pi(L_y) &= x\frac{\partial}{\partial z} - z\frac{\partial}{\partial x} \\
\pi(L_z) &= y\frac{\partial}{\partial x} - x\frac{\partial}{\partial y}.
\end{aligned} \tag{5.3}
\]

It is straightforward to check that these satisfy the required commutation relations,
and so this gives yet another representation of so(3), in fact an infinite-dimensional
one. How does this relate to the spin s representations?


³ We will prove this result ourselves in Sect. 5.9. Also, note the contrast between so(3) and C(P);
C(P) has essentially only one unitary representation, whereas so(3) has an infinite number!

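To see the connection, let’s restrict the differential operators in (5.3) to just the
degree 1 polynomials and represent them in the l = 1 spherical harmonic basis (cf.
Example 2.16 and Exercise 2.11). This yields

\[
[\pi(L_x)] = \frac{i}{\sqrt{2}}\begin{pmatrix} 0 & 1 & 0 \\ 1 & 0 & -1 \\ 0 & -1 & 0 \end{pmatrix}, \quad
[\pi(L_y)] = \frac{1}{\sqrt{2}}\begin{pmatrix} 0 & 1 & 0 \\ -1 & 0 & -1 \\ 0 & 1 & 0 \end{pmatrix}, \quad
[\pi(L_z)] = \begin{pmatrix} -i & 0 & 0 \\ 0 & 0 & 0 \\ 0 & 0 & i \end{pmatrix} \tag{5.4}
\]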

which is just the familiar s = 1 representation. Thus the l = 1 spherical harmonic
representation of so(3) must be equivalent to the standard spin-one representation!
It turns out, unsurprisingly, that this is in fact true for all l ∈ N; we’ll prove this in
Sect. 5.9. Furthermore, this suggests that infinite-dimensional representations like
the space of polynomials on R³ may “decompose” into simpler representations
which are easier to describe. Section 5.7 describes this decomposition process, as
well as the notion of irreducibility which defines what we mean by a “simple”
representation. It will turn out that the irreducible representations of so(3) are
(up to equivalence) the spin s representations. Furthermore, any finite-dimensional
representation (and some infinite-dimensional representations too!) of so(3) can be
decomposed into a collection of irreducible representations. (Such decomposable
collections often arise as tensor products of simpler representations; we’ll study
this in Sect. 5.4.) Thus, virtually any so(3) representation can, with perhaps a
little work decomposing and changing bases, be viewed as a collection of spin s
representations. Analogous results hold for so(3,1), which we’ll prove in Sect. 5.11.
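The passage from differential operators on polynomials to 3 × 3 matrices can be traced explicitly on the degree 1 polynomials. The sketch below rests on two assumptions that should be checked against the conventions in use: that the operators act as π(L_x) = z∂_y − y∂_z, π(L_y) = x∂_z − z∂_x, π(L_z) = y∂_x − x∂_y, and that the l = 1 basis is proportional to (x + iy)/√2, z, (x − iy)/√2 with no additional sign convention. A linear polynomial ax + by + cz is stored as the triple (a, b, c); since the chosen basis is orthonormal for the standard Hermitian inner product on such triples, matrix entries are obtained as inner products.

```python
import math

# Action on a*x + b*y + c*z, stored as (a, b, c). Assumed operator forms:
def Lx(p):
    a, b, c = p
    return (0, -c, b)        # z d/dy - y d/dz

def Ly(p):
    a, b, c = p
    return (c, 0, -a)        # x d/dz - z d/dx

def Lz(p):
    a, b, c = p
    return (-b, a, 0)        # y d/dx - x d/dy

r = 1 / math.sqrt(2)
# Assumed l = 1 basis: (x + iy)/sqrt(2), z, (x - iy)/sqrt(2).
basis = [(r, 1j * r, 0), (0, 0, 1), (r, -1j * r, 0)]

def inner(u, v):
    """Standard Hermitian inner product on coefficient triples."""
    return sum(complex(a).conjugate() * b for a, b in zip(u, v))

def matrix_of(op):
    """Entry (k, j) is the component of op(basis[j]) along basis[k]."""
    return [[inner(basis[k], op(basis[j])) for j in range(3)] for k in range(3)]

def max_abs_diff(A, B):
    return max(abs(x - y) for ra, rb in zip(A, B) for x, y in zip(ra, rb))

expected_x = [[0, 1j * r, 0], [1j * r, 0, -1j * r], [0, -1j * r, 0]]
expected_y = [[0, r, 0], [-r, 0, -r], [0, r, 0]]
expected_z = [[-1j, 0, 0], [0, 0, 0], [0, 0, 1j]]

diffs = [max_abs_diff(matrix_of(op), expected)
         for op, expected in ((Lx, expected_x), (Ly, expected_y), (Lz, expected_z))]
print(diffs)
```

In the monomial basis {x, y, z} the same three functions reproduce the rotation generators themselves, which is one way to see the equivalence with the vector representation.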


With this motivation we now proceed to the precise definitions, as well as a 
wealth of examples. 


Box 5.1 Internal Symmetries 

Before moving on, it’s worth pointing out that the spin degree of freedom
discussed above is just one example of what are known as “internal” degrees
of freedom. These are unrelated to spatial degrees of freedom, and manifest as
additional Hilbert spaces H_internal which one must tensor product with L²(Rⁿ)
(in n spatial dimensions). Since H_internal is unrelated to physical space, the
symmetry groups that act on it need not be related to the symmetries of
spacetime. Nature indeed takes this liberty, leading to the SU(2) “isospin,” SU(3)
“flavor,” and SU(3) “color” symmetries, with corresponding representations
on H_internal. These symmetries and their representations are much more relevant
for particle physics and quantum field theory than for the single-particle
quantum mechanics we focus on, but the basic representation theory we present
here is a necessary prelude to those more advanced topics. An excellent
reference for SU(3) representation theory is Hall [11].


5.2 Representations: Definitions and Basic Examples 


In the previous section we gave a preliminary definition of a Lie algebra represen- 
tation as simply a Lie algebra homomorphism where the target space (or range) is 
the Lie algebra gl(V) of linear operators on a vector space V. Similarly, a group 
representation was simply a group homomorphism where the target space was a 
group of linear operators on V. In both cases, we should think of the resulting 
operators as “representing” the elements of our group or Lie algebra. With these 
basic ideas in mind, we now give the precise definitions. 

A representation of a group G is a vector space V together with a group
homomorphism Π : G → GL(V). Sometimes they are written as a pair (Π, V),
though occasionally when the homomorphism Π is understood we’ll just talk about
V, which is known as the representation space. If V is a real vector space, then we
say that (Π, V) is a real representation, and if V is a complex vector space we say
that it is a complex representation. If G is a matrix Lie group, and V is
finite-dimensional, and the group
homomorphism Π : G → GL(V) is continuous, then Π induces a Lie algebra
homomorphism π : g → gl(V) by (4.87). Any homomorphism from g to gl(V)
for some V is known as a Lie algebra representation, so every finite-dimensional
representation of a Lie group G induces a representation of the corresponding Lie
algebra g. The converse is not true, however; not every representation of g comes
from a corresponding representation of G. This is intimately connected with the fact
that for a given Lie algebra g, there is no unique matrix Lie group G that one can
associate with it. We’ll discuss this in detail in the case of su(2) and so(3,1) later.

In many of our physical applications the vector space V will come equipped with
an inner product (·|·) which is preserved by the operators Π(g), or in other words
Π : G → Isom(V). In this case, we say that Π is a unitary representation, since
each Π(g) will be a unitary operator (cf. Example 4.8). The induced Lie algebra
representation π then maps g into isom(V) (by the argument leading to (4.55)), in
which case π is also referred to as unitary; this terminology applies to any such π :
g → isom(V), regardless of whether it is induced by a unitary group representation.

It is thus very natural to require symmetry generators and observables in quantum
mechanics to be anti-Hermitian. There is, however, another reason for insisting
on anti-Hermitian operators, which is that division by i then yields Hermitian
operators, which are diagonalizable with real eigenvalues (cf. Box 4.4). These two
requirements are logically independent, and there is no reason a priori to suppose
that they can be met simultaneously. Thus, it is very convenient that anti-Hermitian
operators fulfill both!


Box 5.2 Application to Infinite-Dimensions 

As with our discussion of tensors, we will treat mainly finite-dimensional vec- 
tor spaces here but will occasionally be interested in infinite-dimensional appli- 
cations. Rigorous treatment of these applications can be subtle and technical, 
though, so as before we will extend our results to certain infinite-dimensional 
cases without addressing the issues related to infinite-dimensionality. Again, 
you should be assured that all such applications are legitimate and can, in 
theory, be justified. 


Example 5.1. The trivial representation 


As the name suggests, this example will be somewhat trivial, though we will end up 
referring to it later. For any group G (matrix or discrete) and vector space V, define 
the trivial representation of G on V by 


\[ \Pi(g) = I \qquad \forall\, g \in G. \]

You will verify below that this is a representation. Suppose in addition that G is
a matrix Lie group. What is the Lie algebra representation induced by Π? For all
X ∈ g we have

\[ \pi(X) = \left.\frac{d}{dt}\,\Pi(e^{tX})\right|_{t=0} = \left.\frac{d}{dt}\, I\right|_{t=0} = 0, \]

so the trivial representation of a Lie algebra is given by π(X) = 0 ∀ X ∈ g.


Exercise 5.1. Let our representation space be V = C. Show that GL(C) ≃ GL(1,C) ≃
C* where C* is the group of nonzero complex numbers. Then verify that Π : G → C* as
defined above is a group homomorphism, hence a representation. Also verify that π : g →
gl(C) ≃ C given by π(X) = 0 ∀ X is a Lie algebra representation.


Example 5.2. The fundamental representation II = Id 


Let G be a matrix Lie group. By definition, G is a subset of GL(n,C) = GL(Cⁿ)
for some n, so we can simply interpret the elements of G as operators (acting by
matrix multiplication) on V = Cⁿ. This yields the fundamental (or standard)
representation of G, in which case the group homomorphism Π is just the identity
map Id.

If $G = O(3)$ or $SO(3)$, then $V = \mathbb{R}^3$ and the fundamental representation 
is known as the vector representation. If $G = SU(2)$, then $V = \mathbb{C}^2$ and the 
fundamental representation is known as the spinor representation. If $G = SO(3,1)_o$ 
or $O(3,1)$, then $V = \mathbb{R}^4$, and the fundamental representation is also known as the 
vector (or sometimes four-vector) representation. If $G = SL(2,\mathbb{C})$, then $V = \mathbb{C}^2$, 
and the fundamental representation is also known as the spinor representation. 
Vectors in this last representation are sometimes referred to more specifically as 
left-handed spinors, and are used to describe massless relativistic spin 1/2 particles.⁴ 
This is summarized in Table 5.1. 

Table 5.1 Summary of the fundamental representation for various matrix Lie groups G, 
including the representation space V and common nomenclature 

Fundamental representations 
Group G              | V                | Name 
$SU(2)$              | $\mathbb{C}^2$   | Spinor 
$SO(3)$              | $\mathbb{R}^3$   | Vector 
$SO(3,1)_o$          | $\mathbb{R}^4$   | Four-vector 
$SL(2,\mathbb{C})$   | $\mathbb{C}^2$   | Spinor (relativistic) 

Each of these group representations induces a representation of the corresponding 
Lie algebra which then goes by the same name, and which is also given just by 
interpreting the elements of $\mathfrak{g} \subset \mathfrak{gl}(n,\mathbb{C})$ as linear operators. Since Lie algebras 
are vector spaces and a representation $\pi$ is a linear map, we can describe any Lie 
algebra representation completely just by giving the image of the basis vectors under 
$\pi$ (this is one of the nice features of Lie algebra representations; they are much easier 
to visualize concretely). Thus, the vector representation of $\mathfrak{so}(3)$ is given by 


\[
\pi(L_x) = \begin{pmatrix} 0 & 0 & 0 \\ 0 & 0 & -1 \\ 0 & 1 & 0 \end{pmatrix}, \quad
\pi(L_y) = \begin{pmatrix} 0 & 0 & 1 \\ 0 & 0 & 0 \\ -1 & 0 & 0 \end{pmatrix}, \quad
\pi(L_z) = \begin{pmatrix} 0 & -1 & 0 \\ 1 & 0 & 0 \\ 0 & 0 & 0 \end{pmatrix},
\]

where, again, $\pi$ is just the identity. Likewise, the spinor representation of $\mathfrak{su}(2)$ is 
given by 

\[
\pi(S_1) = \frac{1}{2}\begin{pmatrix} 0 & -i \\ -i & 0 \end{pmatrix}, \quad
\pi(S_2) = \frac{1}{2}\begin{pmatrix} 0 & -1 \\ 1 & 0 \end{pmatrix}, \quad
\pi(S_3) = \frac{1}{2}\begin{pmatrix} -i & 0 \\ 0 & i \end{pmatrix},
\]

and similarly for the vector representation of $\mathfrak{so}(3,1)$ and the spinor representation 
of $\mathfrak{sl}(2,\mathbb{C})_{\mathbb{R}}$. 

⁴ There is, of course, such a thing as a right-handed spinor as well, which we’ll meet in the next 
section and which is also used to describe massless spin 1/2 particles. The right- and left-handed 
spinors are known collectively as Weyl spinors, in contrast to the Dirac spinors, which are used to 
describe massive spin 1/2 particles. We shall discuss Dirac spinors towards the end of this chapter. 


Exercise 5.2. Show that the fundamental representations of $SO(3)$, $O(3)$, and $SU(2)$ are 
unitary. (The fundamental representations of $SO(3,1)_o$, $O(3,1)$, and $SL(2,\mathbb{C})$ are not 
unitary, which can be guessed from the fact that the matrices in these groups are not unitary 
matrices. This stems from the fact that these groups preserve the Minkowski metric, which 
is not an inner product.) 
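Since a Lie algebra representation is pinned down by the images of the basis vectors, a quick sanity check on the matrices above is that they satisfy the bracket relations $[X_1, X_2] = X_3$ (and cyclic permutations). The following is a minimal sketch in plain Python; the helper names are illustrative only, and the spinor matrices are entered under the assumption that $S_i = -\frac{i}{2}\sigma_i$ with $\sigma_i$ the Pauli matrices.

```python
# Plain-Python check of the commutation relations; helper names are illustrative.

def matmul(a, b):
    """Product of two square matrices stored as nested lists."""
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def commutator(a, b):
    """Matrix commutator [a, b] = ab - ba."""
    ab, ba = matmul(a, b), matmul(b, a)
    return [[x - y for x, y in zip(row_ab, row_ba)]
            for row_ab, row_ba in zip(ab, ba)]

# Vector representation of so(3): pi(L_x), pi(L_y), pi(L_z).
L = [
    [[0, 0, 0], [0, 0, -1], [0, 1, 0]],
    [[0, 0, 1], [0, 0, 0], [-1, 0, 0]],
    [[0, -1, 0], [1, 0, 0], [0, 0, 0]],
]

# Spinor representation of su(2), ASSUMING S_i = -(i/2) * (Pauli matrix i).
S = [
    [[0, -0.5j], [-0.5j, 0]],
    [[0, -0.5], [0.5, 0]],
    [[-0.5j, 0], [0, 0.5j]],
]

def satisfies_cyclic_brackets(basis):
    """True if [X_1, X_2] = X_3 together with its two cyclic permutations."""
    return all(
        commutator(basis[i], basis[(i + 1) % 3]) == basis[(i + 2) % 3]
        for i in range(3)
    )

print(satisfies_cyclic_brackets(L))
print(satisfies_cyclic_brackets(S))
```

Both bases obey the same bracket relations, which is the concrete content of the isomorphism between $\mathfrak{su}(2)$ and $\mathfrak{so}(3)$ used throughout this section.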


Example 5.3. The adjoint representation 


A less trivial class of examples is given by the Ad homomorphism of Example 4.43. 
Recall that Ad is a map from $G$ to $GL(\mathfrak{g})$, where the operator $\mathrm{Ad}_A$ (for $A \in G$) is 
defined by 

\[ \mathrm{Ad}_A(X) = A X A^{-1}, \quad X \in \mathfrak{g}. \]

In the context of representation theory, the Ad homomorphism is known as the 
adjoint representation $(\mathrm{Ad}, \mathfrak{g})$. Note that the vector space of the adjoint 
representation is just the Lie algebra of $G$! The adjoint representation is thus quite a natural 
construction, and is ubiquitous in representation theory (and elsewhere!) for that 
reason. To get a handle on what the adjoint representation looks like for some of 
the groups we’ve been working with, we consider the corresponding Lie algebra 
representation $(\mathrm{ad}, \mathfrak{g})$, which you will recall acts as 

\[ \mathrm{ad}_X(Y) = [X, Y], \quad X, Y \in \mathfrak{g}. \]

For $\mathfrak{so}(3)$ with basis $\mathcal{B} = \{L_i\}_{i=1-3}$, you have already calculated in Exercise 4.41 
(using the isomorphic $\mathfrak{su}(2)$ with basis $\{S_i\}_{i=1-3}$) that 

\[ [\mathrm{ad}_{L_i}]_{\mathcal{B}} = L_i \]


so for $\mathfrak{so}(3)$ the adjoint representation and fundamental representation are identical! 
(Note that we didn’t have to choose a basis when describing the fundamental 
representation because the use of the standard basis there is implicit.) Does this 
mean that the adjoint representations of the corresponding groups $SO(3)$ and $O(3)$ 
are also identical to the vector representation? Not quite. The adjoint representation 
of $SO(3)$ is identical to the vector representation (as we’ll show), but that does not 
carry over to $O(3)$; for $O(3)$, the inversion transformation $-I$ acts as minus the 
identity in the vector representation, but in the adjoint representation acts as 

\[ \mathrm{Ad}_{-I}(X) = (-I)\,X\,(-I)^{-1} = X \tag{5.5} \]

so $\mathrm{Ad}_{-I}$ is the identity! Thus the vector and adjoint representations of $O(3)$, 
though similar, are not identical, and so the adjoint representation is known as the 
pseudovector representation. This will be discussed further in the next section. 
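Both claims above can be checked mechanically: the matrix of $\mathrm{ad}_{L_i}$ in the basis $\{L_x, L_y, L_z\}$ is obtained by expanding each bracket $[L_i, L_j]$ back in that basis, and $\mathrm{Ad}_{-I}$ is just conjugation by $-I$. The sketch below does this in plain Python; the function names (`coords`, `ad_matrix`, `Ad_minus_I`) are hypothetical helpers introduced only for this illustration.

```python
def matmul(a, b):
    """Product of two square matrices stored as nested lists."""
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def commutator(a, b):
    """Matrix commutator [a, b] = ab - ba."""
    ab, ba = matmul(a, b), matmul(b, a)
    return [[x - y for x, y in zip(ra, rb)] for ra, rb in zip(ab, ba)]

# Basis of so(3) in the vector representation: L_x, L_y, L_z.
L = [
    [[0, 0, 0], [0, 0, -1], [0, 1, 0]],
    [[0, 0, 1], [0, 0, 0], [-1, 0, 0]],
    [[0, -1, 0], [1, 0, 0], [0, 0, 0]],
]

def coords(m):
    """Components (a, b, c) of an antisymmetric matrix m = a L_x + b L_y + c L_z."""
    return [m[2][1], m[0][2], m[1][0]]

def ad_matrix(i):
    """Matrix of ad_{L_i} in the basis L; column j holds the components of [L_i, L_j]."""
    cols = [coords(commutator(L[i], L[j])) for j in range(3)]
    return [[cols[j][k] for j in range(3)] for k in range(3)]

MINUS_I = [[-1, 0, 0], [0, -1, 0], [0, 0, -1]]

def Ad_minus_I(x):
    """Ad_{-I}(X) = (-I) X (-I)^{-1}, using (-I)^{-1} = -I."""
    return matmul(matmul(MINUS_I, x), MINUS_I)

print(all(ad_matrix(i) == L[i] for i in range(3)))    # [ad_{L_i}] = L_i
print(all(Ad_minus_I(x) == x for x in L))             # Ad_{-I} acts as the identity
```

The second check is what distinguishes the adjoint (pseudovector) representation of $O(3)$ from the vector representation, in which $-I$ acts as minus the identity.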

What about the adjoint representations of $SU(2)$ and $\mathfrak{su}(2)$? Well, we already 
met these representations in Examples 4.22 and 4.41, and since $\mathfrak{su}(2) \simeq \mathfrak{so}(3)$ and 
the adjoint representation of $\mathfrak{so}(3)$ is the vector representation, the adjoint 
representations of both $SU(2)$ and $\mathfrak{su}(2)$ are also known as their vector representations. 

As for the adjoint representations of $SO(3,1)_o$ and $O(3,1)$, it is again useful to 
consider first the adjoint representation of their common Lie algebra, $\mathfrak{so}(3,1)$. The 
vector space here is $\mathfrak{so}(3,1)$ itself, which is six-dimensional and spanned by the 
basis $\mathcal{B} = \{L_i, K_i\}_{i=1-3}$. You will compute in Exercise 5.3 below that the matrix 
forms of $\mathrm{ad}_{L_i}$ and $\mathrm{ad}_{K_i}$ are (in $3 \times 3$ block matrix form) 

\[
[\mathrm{ad}_{L_i}] = \begin{pmatrix} L_i & 0 \\ 0 & L_i \end{pmatrix}, \qquad
[\mathrm{ad}_{K_i}] = \begin{pmatrix} 0 & -L_i \\ L_i & 0 \end{pmatrix}. \tag{5.6}
\]

From this we see that the $L_i$ and $K_i$ both transform like vectors under rotations 
($\mathrm{ad}_{L_i}$), but are mixed under boosts ($\mathrm{ad}_{K_i}$). This is reminiscent of the behavior of the 
electric and magnetic field vectors, and it turns out (as we’ll see in Sect. 5.5) that 
the action of $\mathfrak{so}(3,1)$ on itself via the adjoint representation is identical to the 
action of Lorentz transformation generators on the antisymmetric field tensor $F^{\mu\nu}$ 
from Example 3.16. The adjoint representation for $\mathfrak{so}(3,1)$ is thus also known as 
the antisymmetric 2nd rank tensor representation, as is the adjoint representation 
of the corresponding groups $SO(3,1)_o$ and $O(3,1)$. We omit a discussion of the 
adjoint representation of $SL(2,\mathbb{C})$ for technical reasons.⁵ □ 


Exercise 5.3. Verify Eq. (5.6). 
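Before working Exercise 5.3 by hand, it can be reassuring to confirm the block structure of (5.6) numerically. The sketch below is only a check under stated assumptions: it takes the $L_i$ to be the $\mathfrak{so}(3)$ matrices placed in the upper-left $3 \times 3$ block of a $4 \times 4$ matrix, and the boost generators to be $K_i = E_{i4} + E_{4i}$ (time as the fourth coordinate). These explicit matrices, and all helper names, are assumptions made for the illustration; a different convention for the $K_i$ could change signs.

```python
def matmul(a, b):
    """Product of two square matrices stored as nested lists."""
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def commutator(a, b):
    """Matrix commutator [a, b] = ab - ba."""
    ab, ba = matmul(a, b), matmul(b, a)
    return [[x - y for x, y in zip(ra, rb)] for ra, rb in zip(ab, ba)]

# 3x3 rotation generators L_x, L_y, L_z.
L3 = [
    [[0, 0, 0], [0, 0, -1], [0, 1, 0]],
    [[0, 0, 1], [0, 0, 0], [-1, 0, 0]],
    [[0, -1, 0], [1, 0, 0], [0, 0, 0]],
]

def embed(m3):
    """Place a 3x3 block in the upper-left corner of a 4x4 zero matrix."""
    return [m3[r] + [0] for r in range(3)] + [[0, 0, 0, 0]]

def boost(i):
    """ASSUMED boost generator K_i = E_{i4} + E_{4i} (0-based index i)."""
    m = [[0] * 4 for _ in range(4)]
    m[i][3] = m[3][i] = 1
    return m

# Ordered basis {L_1, L_2, L_3, K_1, K_2, K_3} of so(3,1).
basis = [embed(m) for m in L3] + [boost(i) for i in range(3)]

def coords(m):
    """Components of m in the ordered basis above, read off from its entries."""
    return [m[2][1], m[0][2], m[1][0], m[0][3], m[1][3], m[2][3]]

def ad_matrix(x):
    """6x6 matrix of ad_x; column j holds the components of [x, basis[j]]."""
    cols = [coords(commutator(x, b)) for b in basis]
    return [[cols[j][k] for j in range(6)] for k in range(6)]

def block(a, b, c, d):
    """6x6 matrix assembled from four 3x3 blocks [[a, b], [c, d]]."""
    return [a[r] + b[r] for r in range(3)] + [c[r] + d[r] for r in range(3)]

ZERO = [[0] * 3 for _ in range(3)]

def neg(m):
    return [[-e for e in row] for row in m]

ok_L = all(ad_matrix(basis[i]) == block(L3[i], ZERO, ZERO, L3[i]) for i in range(3))
ok_K = all(ad_matrix(basis[3 + i]) == block(ZERO, neg(L3[i]), L3[i], ZERO)
           for i in range(3))
print(ok_L, ok_K)
```

The off-diagonal blocks of $[\mathrm{ad}_{K_i}]$ are what mix the $L_i$ and $K_i$ under boosts.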


Table 5.2 below summarizes the representations we just discussed, along with 
the fundamental representations from Table 5.1. 


⁵ Namely, that the vector space in question, $\mathfrak{sl}(2,\mathbb{C})_{\mathbb{R}}$, is usually regarded as a three-dimensional 
complex vector space in the literature, not as a six-dimensional real vector space (which is the 
viewpoint of interest for us), so to avoid confusion we omit this topic. This won’t affect any 
discussions of physical applications. 




Table 5.2 Summary of the adjoint and fundamental representations for various 
matrix Lie groups G, including the representation spaces V and common nomenclature 

                     | Adjoint                                               | Fundamental 
Group                | V = g                  | Name                         | V                | Name 
$SO(3)$              | $\mathfrak{so}(3)$     | Vector                       | $\mathbb{R}^3$   | Vector 
$O(3)$               | $\mathfrak{so}(3)$     | Pseudovector                 | $\mathbb{R}^3$   | Vector 
$SU(2)$              | $\mathfrak{su}(2)$     | Vector                       | $\mathbb{C}^2$   | Spinor 
$SO(3,1)_o$          | $\mathfrak{so}(3,1)$   | Antisymmetric 2nd rank tensor | $\mathbb{R}^4$  | Four-vector 
$O(3,1)$             | $\mathfrak{so}(3,1)$   | Antisymmetric 2nd rank tensor | $\mathbb{R}^4$  | Four-vector 
$SL(2,\mathbb{C})$   | (omitted)              | (omitted)                    | $\mathbb{C}^2$   | Spinor (relativistic) 


5.3 Further Examples 


The fundamental and adjoint representations of a matrix Lie group are the most 
basic examples of representations and are the ones out of which most others can be 
built, as we’ll see in the next section. There are, however, a few other representations 
of matrix Lie groups that you are probably already familiar with, as well as a 
few representations of abstract Lie algebras and discrete groups that are worth 
discussing. We’ll discuss these in this section, and return in the next section to 
developing the general theory. 


Example 5.4. Representations of $\mathbb{Z}_2$ 


The notion of representations is useful not just for matrix Lie groups, but for more 
general groups as well. Consider the finite group $\mathbb{Z}_2 = \{1, -1\}$. For any vector space 
$V$, we can define the alternating representation $(\Pi_{\mathrm{alt}}, V)$ of $\mathbb{Z}_2$ by 

\[ \Pi_{\mathrm{alt}}(1) = I, \qquad \Pi_{\mathrm{alt}}(-1) = -I. \]

This, along with the trivial representation, allows us to succinctly distinguish 
between the vector and pseudovector representations of $O(3)$⁶; when restricted to 
$\mathbb{Z}_2 \simeq \{I, -I\} \subset O(3)$, the fundamental (vector) representation of $O(3)$ becomes 
$\Pi_{\mathrm{alt}}$, whereas the adjoint (pseudovector) representation becomes $\Pi_{\mathrm{trivial}}$. 

⁶ And, as we’ll see, between the vector and pseudovector representations of $O(3,1)$. 

Another place where these representations of $\mathbb{Z}_2$ crop up is the theory of identical 
particles. Recall from Example 4.25 that there is a homomorphism 

\[ \mathrm{sgn} : S_n \to \mathbb{Z}_2 \]

which tells us whether a given permutation is even or odd. If we then compose 
this map with either of the two representations introduced above, we get two 
representations of $S_n$: $\Pi_{\mathrm{alt}} \circ \mathrm{sgn}$, which is known as the sgn (read: “sign”) 
representation of $S_n$, and $\Pi_{\mathrm{trivial}} \circ \mathrm{sgn}$, which is just the trivial representation 
of $S_n$. If we consider an $n$-particle system with Hilbert space $\mathcal{T}^0_n(\mathcal{H})$, then from 
Example 4.25 we know that $S^n(\mathcal{H}) \subset \mathcal{T}^0_n(\mathcal{H})$ furnishes the trivial representation 
of $S_n$, whereas $\Lambda^n\mathcal{H} \subset \mathcal{T}^0_n(\mathcal{H})$ furnishes the sgn representation of $S_n$. This allows 
us to restate the symmetrization postulate once again, in its arguably most succinct 
form: 

Symmetrization Postulate III: For a system composed of $n$ identical particles, 
any state of the system lives either in the trivial representation of $S_n$ (in 
which case the particles are known as bosons) or in the sgn representation of 
$S_n$ (in which case the particles are known as fermions). 


Box 5.3 Proving the Symmetrization Postulate 

Though the symmetrization postulate, by name, implies that it is to be treated as 
an assumption, it is in fact deducible from the physically motivated requirement 
that any n-particle state be invariant (up to a phase) under particle interchange. 
More formally, we give the following proposition and proof: 


Proposition 5.1. Let $\mathcal{H}_{\mathrm{tot}} = \mathcal{T}^0_n(\mathcal{H})$ be the Hilbert space of a system with $n$ 
identical particles, and let $\Pi : S_n \to GL(\mathcal{H}_{\mathrm{tot}})$ be the representation of $S_n$ on 
$\mathcal{H}_{\mathrm{tot}}$ given by 

\[
(\Pi(\sigma))(v_1 \otimes v_2 \otimes \cdots \otimes v_n) = v_{\sigma(1)} \otimes v_{\sigma(2)} \otimes \cdots \otimes v_{\sigma(n)}
\quad \forall\, \sigma \in S_n,\; v_i \in \mathcal{H}.
\]

Then if $\psi \in \mathcal{H}_{\mathrm{tot}}$ satisfies 

\[ \Pi(\sigma)(\psi) = c\,\psi \quad \forall\, \sigma \in S_n, \text{ with } |c| = 1 \tag{5.7} \]

(where $c$ may depend on $\sigma$ and $\psi$), then $\psi$ is an element of either $S^n(\mathcal{H})$ or 
$\Lambda^n\mathcal{H}$. 


Proof. Equation (5.7) tells us that we can restrict $\Pi$ to $\mathrm{Span}\{\psi\}$ to get a 
complex one-dimensional representation 

\[ \Pi_\psi : S_n \to GL(\mathrm{Span}\{\psi\}) \simeq \mathbb{C}^*. \]

Furthermore, if we consider a transposition $\tau \in S_n$, we have $\tau^2 = I$ which 
means 

\[ \psi = \Pi_\psi(\tau)^2\,\psi = c^2\,\psi \;\Longrightarrow\; c = \pm 1 \tag{5.8} \]

and so, since the transpositions generate $S_n$, we have $\Pi_\psi : S_n \to \mathbb{Z}_2$. All we need 
to prove, then, is that the trivial and sgn representations are the only such 
representations of $S_n$. 

We proceed by contradiction. Assume that $\Pi_\psi : S_n \to \mathbb{Z}_2$ is neither the 
trivial nor the sgn representation. Then there must exist transpositions $\tau_+$, $\tau_-$ 
such that $\Pi_\psi(\tau_\pm) = \pm 1$. Assume, without loss of generality, that $\tau_+ = (12)$, 
where the notation $(12)$ denotes the transposition that switches indices 1 and 2. 
Note that $(12) = (21) = (12)^{-1}$. We write $\tau_-$ more generally as $(ij)$, $i, j \in 
\{1, \ldots, n\}$. Then, letting $\rho = (1i)(2j) \in S_n$, consider the permutation 

\[ \rho\,\tau_+\,\rho^{-1} = (1i)(2j)(12)(2j)(1i). \]

By carefully tracing through the action of this permutation on the indices 
$1, 2, i, j$, you should convince yourself that, in fact, $\rho\,\tau_+\,\rho^{-1} = \tau_-$. But this 
implies 

\[
-1 = \Pi_\psi(\tau_-) = \Pi_\psi(\rho\,\tau_+\,\rho^{-1}) = \Pi_\psi(\rho)\,\Pi_\psi(\tau_+)\,\Pi_\psi(\rho^{-1}) = \Pi_\psi(\tau_+) = +1.
\]

This contradiction completes the proof. □ 
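The one step of the proof left to the reader, that conjugating $(12)$ by $\rho = (1i)(2j)$ yields $(ij)$, is easy to trace by machine. The sketch below is an illustration only: it fixes $n = 5$, stores a permutation as a tuple of images, and composes right to left to match the product notation. The function names are hypothetical, and the pairs are ordered so that $i < j$ (so that $j \neq 1$, which the tracing argument implicitly needs).

```python
from itertools import combinations

N = 5  # number of particles (illustrative choice)

def transposition(a, b, n=N):
    """The permutation (a b) of {1..n}, stored as a tuple p with p[k-1] the image of k."""
    p = list(range(1, n + 1))
    p[a - 1], p[b - 1] = b, a
    return tuple(p)

def compose(*perms):
    """Compose permutations right to left: compose(s, t)(k) = s(t(k))."""
    result = tuple(range(1, len(perms[0]) + 1))
    for p in reversed(perms):
        result = tuple(p[v - 1] for v in result)
    return result

def conjugate_of_12(i, j):
    """rho (12) rho^{-1} with rho = (1i)(2j), written out as (1i)(2j)(12)(2j)(1i)."""
    return compose(transposition(1, i), transposition(2, j), transposition(1, 2),
                   transposition(2, j), transposition(1, i))

# Every transposition (ij) other than (12), written with i < j.
pairs = [(i, j) for i, j in combinations(range(1, N + 1), 2) if (i, j) != (1, 2)]
print(all(conjugate_of_12(i, j) == transposition(i, j) for i, j in pairs))
```

Because $\Pi_\psi$ takes values in the commutative group $\mathbb{Z}_2$, conjugate permutations must receive the same value, which is exactly what produces the contradiction.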


Example 5.5. The four-vector representation of $SL(2,\mathbb{C})$ 


Recall from Example 4.23 that $SL(2,\mathbb{C})$ acts on $H_2(\mathbb{C})$ by sending $X \mapsto AXA^\dagger$, 
where $X \in H_2(\mathbb{C})$ and $A \in SL(2,\mathbb{C})$. It’s easy to see that this actually defines a 
representation $(\Pi, H_2(\mathbb{C}))$ given by 

\[ \Pi(A)(X) = AXA^\dagger. \]

We already saw that if we take $\mathcal{B} = \{\sigma_x, \sigma_y, \sigma_z, I\}$ as a basis for $H_2(\mathbb{C})$, then 
$[\Pi(A)]_{\mathcal{B}} \in SO(3,1)_o$, so in this basis the action of $\Pi(A)$ looks like the action of 
restricted Lorentz transformations on four-vectors. Hence $(\Pi, H_2(\mathbb{C}))$ is also known 
as the four-vector representation of $SL(2,\mathbb{C})$. 
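To see the Lorentz-like behavior concretely, one can pick a sample $A$ with unit determinant, act on a Hermitian matrix $X = x\sigma_x + y\sigma_y + z\sigma_z + tI$, and confirm that the result is still Hermitian and that the quadratic form $x^2 + y^2 + z^2 - t^2 = -\det X$ is unchanged. The sketch below does this in plain Python. The particular matrix `A`, the signature convention $(+,+,+,-)$, and the helper names are assumptions made for the illustration.

```python
def matmul2(a, b):
    """Product of two 2x2 matrices stored as nested lists."""
    return [[sum(a[i][k] * b[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

def dagger(a):
    """Conjugate transpose of a 2x2 matrix."""
    return [[complex(a[j][i]).conjugate() for j in range(2)] for i in range(2)]

def to_matrix(x, y, z, t):
    """X = x*sigma_x + y*sigma_y + z*sigma_z + t*I, a Hermitian 2x2 matrix."""
    return [[complex(t + z), complex(x, -y)],
            [complex(x, y), complex(t - z)]]

def to_components(m):
    """Read (x, y, z, t) back off a Hermitian matrix."""
    t = (m[0][0] + m[1][1]).real / 2
    z = (m[0][0] - m[1][1]).real / 2
    return (m[1][0].real, m[1][0].imag, z, t)

def interval(x, y, z, t):
    """Quadratic form with signature (+, +, +, -); equals -det X."""
    return x * x + y * y + z * z - t * t

def act(a, components):
    """Pi(A)(X) = A X A^dagger, expressed on the components of X."""
    x_new = matmul2(matmul2(a, to_matrix(*components)), dagger(a))
    return to_components(x_new)

# A sample element of SL(2, C): det = (1 + i)(1) - (2)(i/2) = 1.
A = [[1 + 1j, 2], [0.5j, 1]]
v = (0.3, -1.2, 2.0, 0.7)
w = act(A, v)
transformed = matmul2(matmul2(A, to_matrix(*v)), dagger(A))

print(abs(interval(*v) - interval(*w)) < 1e-9)                         # interval preserved
print(abs(transformed[0][1] - transformed[1][0].conjugate()) < 1e-9)   # still Hermitian
```

Preservation of the interval follows from $\det(AXA^\dagger) = |\det A|^2 \det X = \det X$; the numerical check simply makes that visible for one choice of $A$.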


Example 5.6. The right-handed spinor representation of $SL(2,\mathbb{C})$ 


As one might expect, the right-handed spinor representation $(\bar{\Pi}, \mathbb{C}^2)$ of 
$SL(2,\mathbb{C})$ is closely related to the fundamental (left-handed spinor) representation. 
It is defined simply by taking the adjoint inverse of the fundamental representation, 
that is 

\[ \bar{\Pi}(A)\,v = A^{\dagger -1}\,v, \quad A \in SL(2,\mathbb{C}),\; v \in \mathbb{C}^2. \]

You can easily check that this defines a bona fide representation of $SL(2,\mathbb{C})$. The 
usual four component Dirac spinor can be thought of as a kind of “sum” of a 
left-handed spinor and a right-handed one, as we’ll discuss later on. □ 

The next few examples are instances of a general class of representations that is 
worth describing briefly. Say we are given a finite-dimensional vector space $V$, a 
representation $\Pi$ of $G$ on $V$, and a possibly infinite-dimensional vector space $C(V)$ 
of functions on $V$. (This could be, for instance, the set $P_l(V)$ of all polynomial 
functions of a fixed degree $l$, or the set of all infinitely differentiable complex-valued 
functions $C^\infty(V)$, or the set of all square-integrable functions $L^2(V)$.) Then 
the representation $\Pi$ on $V$ induces a representation $\tilde{\Pi}$ on $C(V)$ as follows: if 
$f \in C(V)$, then the function $\tilde{\Pi}_g f \in C(V)$ is just given by $f \circ \Pi_g^{-1}$, or 

\[ (\tilde{\Pi}_g f)(v) = f(\Pi_g^{-1} v), \quad g \in G,\; f \in C(V),\; v \in V. \tag{5.9} \]

There are a couple of things to check. First, it must be verified that $\tilde{\Pi}_g f$ is actually 
an element of $C(V)$; this, of course, depends on the exact nature of $C(V)$ and 
of $\Pi$ and must be checked independently for each example. Assuming this is 
true, we also need to verify that $\tilde{\Pi}(g)$ is a linear operator and that it satisfies 
$\tilde{\Pi}(gh) = \tilde{\Pi}(g)\tilde{\Pi}(h)$. This computation is the same for all such examples, and 
you will perform it in Exercise 5.4 below. 


Exercise 5.4. Confirm that if $(\Pi, V)$ is a representation of $G$, then $\tilde{\Pi}_g$ is a linear operator 
on $C(V)$, and that $\tilde{\Pi}(g_1)\tilde{\Pi}(g_2) = \tilde{\Pi}(g_1 g_2)$. What happens if you try to define $\tilde{\Pi}$ using $g$ 
instead of $g^{-1}$? 
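The role of the inverse in the induced representation can be explored numerically before doing the exercise in general. The sketch below is a toy illustration under stated assumptions: it takes $V = \mathbb{R}^2$, uses two non-commuting integer matrices of determinant one as stand-ins for group elements, and compares the definition with $g^{-1}$ against a naive definition using $g$. All names (`induced`, `naive`, the test function `f`) are hypothetical.

```python
def matvec(m, v):
    """Apply a 2x2 matrix to a 2-vector."""
    return (m[0][0] * v[0] + m[0][1] * v[1], m[1][0] * v[0] + m[1][1] * v[1])

def matmul(a, b):
    """Product of two 2x2 matrices."""
    return [[sum(a[i][k] * b[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

def inverse(m):
    """Inverse of a 2x2 matrix with determinant 1."""
    return [[m[1][1], -m[0][1]], [-m[1][0], m[0][0]]]

def induced(g):
    """The operator f -> f o g^{-1} on functions of v, as in the induced representation."""
    g_inv = inverse(g)
    return lambda f: (lambda v: f(matvec(g_inv, v)))

def naive(g):
    """The operator f -> f o g, i.e. the same recipe without the inverse."""
    return lambda f: (lambda v: f(matvec(g, v)))

def f(v):
    """An arbitrary test function on R^2 (illustrative choice)."""
    return v[0] ** 2 + 3 * v[1]

g = [[1, 1], [0, 1]]
h = [[1, 0], [1, 1]]   # g and h do not commute
v = (2, 5)

# With the inverse: applying h first and then g agrees with the operator for gh.
print(induced(g)(induced(h)(f))(v) == induced(matmul(g, h))(f)(v))
# Without the inverse the order of multiplication is reversed (hg instead of gh).
print(naive(g)(naive(h)(f))(v) == naive(matmul(h, g))(f)(v))
print(naive(g)(naive(h)(f))(v) == naive(matmul(g, h))(f)(v))
```

For this choice of `g`, `h`, `f`, and `v` the last comparison fails, which is a hint toward the answer to the final question of Exercise 5.4: without the inverse the composition law comes out in the wrong order.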


Example 5.7. The spin $s$ representation of $\mathfrak{su}(2)$ and polynomials on $\mathbb{C}^2$ 


We already know that the Hilbert space corresponding to a spin $s$ particle fixed 
in space is $\mathbb{C}^{2s+1}$. Since the spin angular momentum $\mathbf{S}$ is an observable for this 
system and the $S_i$ have the $\mathfrak{su}(2)$ structure constants, $\mathbb{C}^{2s+1}$ must be a representation 
of $\mathfrak{su}(2)$. How is this representation defined? Usually one answers this question 
by considering the eigenvalues of $S_z$ and showing that for any finite-dimensional 
representation $V$, the eigenvalues of $S_z$ lie between $-s$ and $s$ for some half-integral 
$s$. $V$ is then defined to be the span of the eigenvectors of $S_z$ (from which we 
conclude that $\dim V = 2s + 1$), and the action of $S_x$ and $S_y$ is determined by the 
$\mathfrak{su}(2)$ commutation relations. This construction is important and will be presented 
in Sect. 5.9, but it is also rather abstract; for the time being, we present an alternate 
construction which is more concrete. 

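Consider the set of all degree $l$ polynomials on $\mathbb{C}^2$, i.e. the set of all degree $l$ 
polynomials in the two complex variables $z_1$ and $z_2$. This is a complex vector space, 
denoted $P_l(\mathbb{C}^2)$, and has basis 

\[
\mathcal{B}_l = \{ z_1^{l-k} z_2^{k} \mid 0 \le k \le l \} = \{ z_1^{l},\; z_1^{l-1} z_2,\; \ldots,\; z_1 z_2^{l-1},\; z_2^{l} \}
\]

and hence dimension $l + 1$. The fundamental representation of $SU(2)$ on $\mathbb{C}^2$ 
then induces an $SU(2)$ representation $(\Pi_l, P_l(\mathbb{C}^2))$ as described above: given a 
polynomial function $p \in P_l(\mathbb{C}^2)$ and $A \in SU(2)$, $\Pi_l(A)(p)$ is the degree $l$ 
polynomial given by 

\[ (\Pi_l(A)(p))(v) = p(A^{-1} v), \quad v \in \mathbb{C}^2. \]

To make this representation concrete, consider the degree one polynomial 
$p(z_1, z_2) = z_1$. This polynomial function just picks off the first coordinate of 
any $v \in \mathbb{C}^2$. Let 

\[
v = \begin{pmatrix} z_1 \\ z_2 \end{pmatrix}, \qquad
A = \begin{pmatrix} \alpha & \beta \\ -\bar{\beta} & \bar{\alpha} \end{pmatrix}
\]

so that 

\[ A^{-1} = \begin{pmatrix} \bar{\alpha} & -\beta \\ \bar{\beta} & \alpha \end{pmatrix}. \]

Then 

\[ (\Pi_1(A)p)(v) = p(A^{-1} v) \]

but 

\[
A^{-1} v = \begin{pmatrix} \bar{\alpha} & -\beta \\ \bar{\beta} & \alpha \end{pmatrix}
\begin{pmatrix} z_1 \\ z_2 \end{pmatrix}
= \begin{pmatrix} \bar{\alpha} z_1 - \beta z_2 \\ \bar{\beta} z_1 + \alpha z_2 \end{pmatrix} \tag{5.10}
\]

so then 

\[ \Pi_1(A) z_1 = \bar{\alpha} z_1 - \beta z_2. \]

Likewise, (5.10) tells us that 

\[ \Pi_1(A) z_2 = \bar{\beta} z_1 + \alpha z_2. \]

From this, the action of $\Pi_l(A)$ on higher order polynomials can be determined since 
$\Pi_l(A)(z_1^{l-k} z_2^{k}) = (\Pi_1(A) z_1)^{l-k}\,(\Pi_1(A) z_2)^{k}$. 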

We can then consider the induced Lie algebra representation $(\pi_l, P_l(\mathbb{C}^2))$, in 
which (as you will show) $z_1^{l-k} z_2^{k}$ is an eigenvector of $S_z$ with eigenvalue $i(l/2 - k)$, 
so that 

\[
[\pi_l(S_z)]_{\mathcal{B}_l} = i \begin{pmatrix} l/2 & & & \\ & l/2 - 1 & & \\ & & \ddots & \\ & & & -l/2 \end{pmatrix}. \tag{5.11}
\]

If we let $s = l/2$, then we recognize this as the usual form (up to that pesky factor 
of $i$) of $S_z$ acting on the Hilbert space of a spin $s$ particle. □ 
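As a small worked instance of (5.11), take $l = 2$, i.e. spin $s = 1$. The computation below is a sketch that assumes the convention $S_z = \frac{1}{2}\,\mathrm{diag}(-i,\, i)$ for the relevant basis element of $\mathfrak{su}(2)$, so that $e^{tS_z}$ is diagonal; with a different normalization the phases change accordingly.

```latex
\[
\begin{aligned}
A = e^{tS_z} &= \begin{pmatrix} e^{-it/2} & 0 \\ 0 & e^{it/2} \end{pmatrix}
  \quad\Longrightarrow\quad \alpha = e^{-it/2},\ \beta = 0, \\
\Pi_1(A)\,z_1 &= \bar{\alpha}\,z_1 = e^{it/2} z_1, \qquad
\Pi_1(A)\,z_2 = \alpha\,z_2 = e^{-it/2} z_2, \\
\Pi_2(A)\,(z_1^{2-k} z_2^{k}) &= e^{i(2-k)t/2}\, e^{-ikt/2}\, z_1^{2-k} z_2^{k}
  = e^{i(1-k)t}\, z_1^{2-k} z_2^{k}, \\
\pi_2(S_z)\,(z_1^{2-k} z_2^{k}) &= \frac{d}{dt}\, e^{i(1-k)t}\Big|_{t=0}\, z_1^{2-k} z_2^{k}
  = i(1-k)\, z_1^{2-k} z_2^{k}, \\
[\pi_2(S_z)]_{\mathcal{B}_2} &= i \begin{pmatrix} 1 & 0 & 0 \\ 0 & 0 & 0 \\ 0 & 0 & -1 \end{pmatrix}.
\end{aligned}
\]
```

The three basis polynomials $z_1^2$, $z_1 z_2$, $z_2^2$ thus play the role of the $m = 1, 0, -1$ states of a spin one particle.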


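Exercise 5.5. Use the definition of induced Lie algebra representations and the explicit 
form of $e^{tS_i}$ in the fundamental representation [which can be deduced from (4.74)] to 
compute the action of the operators $\pi_1(S_i)$ on the functions $z_1$ and $z_2$ in $P_1(\mathbb{C}^2)$. Show 
that we can write these operators in differential form as 

\[
\begin{aligned}
\pi_1(S_1) &= \frac{i}{2}\left( z_2 \frac{\partial}{\partial z_1} + z_1 \frac{\partial}{\partial z_2} \right) \\
\pi_1(S_2) &= \frac{1}{2}\left( z_2 \frac{\partial}{\partial z_1} - z_1 \frac{\partial}{\partial z_2} \right) \\
\pi_1(S_3) &= \frac{i}{2}\left( z_1 \frac{\partial}{\partial z_1} - z_2 \frac{\partial}{\partial z_2} \right).
\end{aligned} \tag{5.12}
\]

Prove that these expressions also hold for the operators $\pi_l(S_i)$ on $P_l(\mathbb{C}^2)$. Verify the $\mathfrak{su}(2)$ 
commutation relations directly from these expressions. Finally, use (5.12) to show that 

\[ (\pi_l(S_3))(z_1^{l-k} z_2^{k}) = i(l/2 - k)\, z_1^{l-k} z_2^{k}. \]

Warning: This is a challenging exercise, but worthwhile. See the next example for an 
analogous calculation for $SO(3)$. Also, don’t forget that for a Lie group representation $\Pi$, 
induced Lie algebra representation $\pi$, and Lie algebra element $X$, 

\[
\pi(X) = \frac{d}{dt}\,\Pi(e^{tX})\Big|_{t=0} \neq \Pi\!\left( \frac{d}{dt}\, e^{tX}\Big|_{t=0} \right)
\]

as discussed in Box 4.7. 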


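Example 5.8. $L^2(\mathbb{R}^3)$ as a representation of $SO(3)$ 


Recall from Example 3.18 that $L^2(\mathbb{R}^3)$ is the set of all complex-valued 
square-integrable functions on $\mathbb{R}^3$, and is physically interpreted as the Hilbert space 
of a spinless particle moving in three dimensions. As in the previous example, 
the fundamental representation of $SO(3)$ on $\mathbb{R}^3$ induces an $SO(3)$ representation 
$(\Pi, L^2(\mathbb{R}^3))$ by 

\[ (\Pi(R)f)(\mathbf{x}) = f(R^{-1}\mathbf{x}), \quad f \in L^2(\mathbb{R}^3),\; R \in SO(3),\; \mathbf{x} \in \mathbb{R}^3. \]

One can think of $\Pi(R)f$ as just a “rotated” version of the function $f$. This 
representation is unitary, as we’ll now digress for a moment to show. 

Let $\langle \cdot | \cdot \rangle$ denote the Hilbert space inner product on $L^2(\mathbb{R}^3)$. To show that $\Pi$ is 
unitary, we need to show that 

\[ \langle \Pi(R)f \,|\, \Pi(R)g \rangle = \langle f | g \rangle \quad \forall\, f, g \in L^2(\mathbb{R}^3). \]

First off, we have 

\[
\begin{aligned}
\langle \Pi(R)f \,|\, \Pi(R)g \rangle &= \int d^3x\; \overline{(\Pi_R f)(\mathbf{x})}\,(\Pi_R g)(\mathbf{x}) \\
&= \int d^3x\; \overline{f(R^{-1}\mathbf{x})}\, g(R^{-1}\mathbf{x}).
\end{aligned}
\]

If we now change variables to $\mathbf{x}' = R^{-1}\mathbf{x}$ and remember to include the Jacobian, 
we get 

\[
\begin{aligned}
\langle \Pi(R)f \,|\, \Pi(R)g \rangle &= \int d^3x'\, \left| \frac{\partial(x, y, z)}{\partial(x', y', z')} \right| \overline{f(\mathbf{x}')}\, g(\mathbf{x}') \\
&= \int d^3x'\; \overline{f(\mathbf{x}')}\, g(\mathbf{x}') \\
&= \langle f | g \rangle,
\end{aligned}
\]

where you will verify below that the Jacobian determinant $\left| \frac{\partial(x, y, z)}{\partial(x', y', z')} \right|$ is equal to 
one. This should be no surprise; the Jacobian tells us how volumes change under a 
change of variables, but since in this case the change of variables is given just by a 
rotation (which we know preserves volumes), we should expect the Jacobian to be 
one. 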

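Though the definition of $\Pi$ might look strange, it is actually the action of 
rotations on position kets that we’re familiar with; if we act on the basis “vector” 
$|\mathbf{x}_0\rangle = \delta(\mathbf{x} - \mathbf{x}_0)$, we find that (for arbitrary $\psi \in L^2(\mathbb{R}^3)$) 

\[
\begin{aligned}
\langle \psi | \Pi_R | \mathbf{x}_0 \rangle &= \int d^3x\; \bar{\psi}(\mathbf{x})\, \delta(R^{-1}\mathbf{x} - \mathbf{x}_0) \\
&= \int d^3x'\; \bar{\psi}(R\mathbf{x}')\, \delta(\mathbf{x}' - \mathbf{x}_0) \quad \text{where we let } \mathbf{x}' = R^{-1}\mathbf{x} \\
&= \bar{\psi}(R\mathbf{x}_0)
\end{aligned}
\]

hence we must have 

\[ \Pi_R |\mathbf{x}_0\rangle = |R\mathbf{x}_0\rangle \]

which is the familiar action of rotations on position kets. 

What does the corresponding representation of $\mathfrak{so}(3)$ look like?⁷ As mentioned 
above, we can get a handle on that just by computing $\pi(L_i)$, since $\pi$ is a linear map 
and any $X \in \mathfrak{so}(3)$ can just be expressed as a linear combination of the $L_i$. Hence 
we compute: 

\[
\begin{aligned}
(\pi(L_i) f)(\mathbf{x}) &= \frac{d}{dt}\big(\Pi(e^{tL_i}) f\big)(\mathbf{x})\Big|_{t=0} && \text{by the definition of } \pi \\
&= \frac{d}{dt} f(e^{-tL_i}\mathbf{x})\Big|_{t=0} && \text{by the definition of } \Pi \\
&= \sum_{j=1}^{3} \frac{\partial f}{\partial x^j}(\mathbf{x})\, \frac{d}{dt}\big(e^{-tL_i}\mathbf{x}\big)^j\Big|_{t=0} && \text{by the multivariable chain rule} \\
&= -\sum_{j=1}^{3} \frac{\partial f}{\partial x^j}(\mathbf{x})\, (L_i \mathbf{x})^j \\
&= -\sum_{j,k=1}^{3} \frac{\partial f}{\partial x^j}(\mathbf{x})\, (L_i)_{jk}\, x^k \\
&= \sum_{j,k=1}^{3} \epsilon_{ijk}\, x^k\, \frac{\partial f}{\partial x^j}(\mathbf{x}) && \text{by (4.69)}
\end{aligned}
\]

so that, relabeling dummy indices, 

\[ \pi(L_i) = -\sum_{j,k=1}^{3} \epsilon_{ijk}\, x^j \frac{\partial}{\partial x^k}. \]

More concretely, we have 

\[
\begin{aligned}
\pi(L_x) &= z \frac{\partial}{\partial y} - y \frac{\partial}{\partial z} \\
\pi(L_y) &= x \frac{\partial}{\partial z} - z \frac{\partial}{\partial x} \\
\pi(L_z) &= y \frac{\partial}{\partial x} - x \frac{\partial}{\partial y}
\end{aligned} \tag{5.13}
\]

which, up to our usual factor of $i$, is just (2.17)! □ 

⁷ We should mention here that the infinite-dimensionality of $L^2(\mathbb{R}^3)$ makes a proper treatment of 
the induced Lie algebra representation quite subtle; for instance, we calculate in this example that 
the elements of $\mathfrak{so}(3)$ are to be represented by differential operators, yet not all functions in $L^2(\mathbb{R}^3)$ 
are differentiable! (Just think of a step function which is equal to 1 inside the unit sphere and 0 
outside the unit sphere; this function is not differentiable at $r = 1$.) In this example and elsewhere, 
we ignore such subtleties. 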


Exercise 5.6. Verify that if $\mathbf{x}' = R^{-1}\mathbf{x}$ for some rotation $R \in SO(3)$, then 

\[ \left| \frac{\partial(x, y, z)}{\partial(x', y', z')} \right| = 1. \tag{5.14} \]
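The differential-operator form of $\pi(L_z)$ in (5.13) can also be checked numerically: differentiate $f(e^{-tL_z}\mathbf{x})$ at $t = 0$ by finite differences and compare with $y\,\partial f/\partial x - x\,\partial f/\partial y$. The sketch below is illustrative only; the test function, the sample point, and the step size are arbitrary choices, and it assumes that $e^{tL_z}$ is the counterclockwise rotation by angle $t$ about the $z$-axis.

```python
import math

def rotate_z(angle, p):
    """Apply e^{angle * L_z}: rotate p = (x, y, z) about the z-axis by `angle`."""
    x, y, z = p
    c, s = math.cos(angle), math.sin(angle)
    return (c * x - s * y, s * x + c * y, z)

def f(p):
    """An arbitrary smooth test function on R^3 (illustrative choice)."""
    x, y, z = p
    return math.exp(-(x * x + 2 * y * y + z * z)) * (x + 3 * y * z)

def lie_derivative_numeric(p, h=1e-5):
    """d/dt f(e^{-t L_z} p) at t = 0, approximated by a central difference."""
    return (f(rotate_z(-h, p)) - f(rotate_z(h, p))) / (2 * h)

def partial(g, p, axis, h=1e-5):
    """Central-difference partial derivative of g along the given coordinate axis."""
    lo, hi = list(p), list(p)
    lo[axis] -= h
    hi[axis] += h
    return (g(tuple(hi)) - g(tuple(lo))) / (2 * h)

def lie_derivative_formula(p):
    """(y d/dx - x d/dy) f, the z-component of (5.13)."""
    x, y, _ = p
    return y * partial(f, p, 0) - x * partial(f, p, 1)

point = (0.3, -0.7, 0.5)
print(abs(lie_derivative_numeric(point) - lie_derivative_formula(point)) < 1e-6)
```

The minus sign in `rotate_z(-h, p)` reflects the $R^{-1}$ in the definition of $\Pi$, which is the source of the overall sign in $\pi(L_i)$.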


Example 5.9. $\mathcal{H}_l(\mathbb{R}^3)$, $\tilde{\mathcal{H}}_l$ and $L^2(S^2)$ as representations of $SO(3)$ 


Recall from Chap. 2 that $\mathcal{H}_l(\mathbb{R}^3)$ is the vector space of all harmonic complex-valued 
degree $l$ polynomials on $\mathbb{R}^3$. Since $\mathcal{H}_l(\mathbb{R}^3)$ is a space of functions on $\mathbb{R}^3$, we get an 
$SO(3)$ representation of the same form as in the last example, namely 

\[ (\Pi_l(R)(f))(\mathbf{x}) = f(R^{-1}\mathbf{x}), \quad f \in \mathcal{H}_l(\mathbb{R}^3),\; R \in SO(3),\; \mathbf{x} \in \mathbb{R}^3. \]

You will check below that if $f$ is a harmonic polynomial of degree $l$, then 
$\Pi_l(R)f$ is too, so that $\Pi_l$ really is a representation on $\mathcal{H}_l(\mathbb{R}^3)$. The induced $\mathfrak{so}(3)$ 
representation also has the same form as in the previous example. 

If we now restrict all the functions in $\mathcal{H}_l(\mathbb{R}^3)$ to the unit sphere, we get a 
representation of $SO(3)$ on $\tilde{\mathcal{H}}_l$, the space of spherical harmonics of degree $l$. 
Concretely, we can describe this representation by writing $Y(\theta, \phi)$ as $Y(\hat{\mathbf{n}})$, where 
$\hat{\mathbf{n}}$ is a unit vector giving the point on the sphere which corresponds to $(\theta, \phi)$. Then 
the $SO(3)$ representation $(\tilde{\Pi}_l, \tilde{\mathcal{H}}_l)$ is given simply by 

\[ (\tilde{\Pi}_l(R)Y)(\hat{\mathbf{n}}) = Y(R^{-1}\hat{\mathbf{n}}). \tag{5.15} \]


The interesting thing about this representation is that it turns out to be unitary! The 
inner product in this case is just given by integration over the sphere with the usual 
area form, i.e. 

\[
\langle Y_1(\theta, \phi) \,|\, Y_2(\theta, \phi) \rangle = \int_0^{\pi}\!\!\int_0^{2\pi} \overline{Y_1(\theta, \phi)}\, Y_2(\theta, \phi)\, \sin\theta \; d\phi\, d\theta.
\]

Proving that (5.15) is unitary with respect to this inner product is straightforward 
but tedious, so we omit the calculation.⁸ 

One nice thing about this inner product, though, is that we can use it to define a 
notion of square-integrability just as we did for $\mathbb{R}$ and $\mathbb{R}^3$: we say that a $\mathbb{C}$-valued 
function $Y(\theta, \phi)$ on the sphere is square-integrable if 

\[
\langle Y(\theta, \phi) \,|\, Y(\theta, \phi) \rangle = \int_0^{\pi}\!\!\int_0^{2\pi} |Y(\theta, \phi)|^2 \sin\theta \; d\phi\, d\theta < \infty.
\]

Just as with square-integrable functions on $\mathbb{R}$, the set of all such functions forms a 
Hilbert space, usually denoted as $L^2(S^2)$, where $S^2$ denotes the (two-dimensional) 
unit sphere in $\mathbb{R}^3$. It’s easy to see⁹ that each $\tilde{\mathcal{H}}_l \subset L^2(S^2)$, and in fact it turns out that 
all the $\tilde{\mathcal{H}}_l$ taken together are actually equal to $L^2(S^2)$! We’ll discuss this further in 
Sect. 5.7, but for now we note that this implies that the set $\{Y^l_m \mid 0 \le l < \infty,\; -l \le 
m \le l\}$ of all the spherical harmonics forms an (orthonormal) basis for $L^2(S^2)$. This 
can be thought of as a consequence of the spectral theorems of functional analysis, 
crucial to quantum mechanics, which tell us that, under suitable hypotheses, the 
eigenfunctions of a self-adjoint linear operator (in this case the spherical Laplacian 
$\Delta_{S^2}$) form an orthonormal basis for the Hilbert space on which it acts. □ 

⁸ Just as in the previous example, however, one can define transformed coordinates $\theta'$ and $\phi'$ and 
then unitarity hinges on the Jacobian determinant $\left| \frac{\partial(\theta, \phi)}{\partial(\theta', \phi')} \right|$ being equal to one, which it is because 
rotations preserve area on the sphere. 


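Exercise 5.7. Let $f \in \mathcal{H}_l(\mathbb{R}^3)$. Convince yourself that $\Pi_l(R)f$ is also a degree $l$ polynomial, 
and then use the chain rule to show that it’s harmonic, hence an element of $\mathcal{H}_l(\mathbb{R}^3)$. The 
orthogonality of $R$ should be crucial in your calculation! 


Exercise 5.8. If you’ve never done so, find the induced $\mathfrak{so}(3)$ representation $\tilde{\pi}_l$ on $\tilde{\mathcal{H}}_l$ by 
expressing (5.13) in spherical coordinates. You should get 

\[
\begin{aligned}
\tilde{\pi}_l(L_x) &= \sin\phi\, \frac{\partial}{\partial\theta} + \cot\theta \cos\phi\, \frac{\partial}{\partial\phi} \\
\tilde{\pi}_l(L_y) &= -\cos\phi\, \frac{\partial}{\partial\theta} + \cot\theta \sin\phi\, \frac{\partial}{\partial\phi} \\
\tilde{\pi}_l(L_z) &= -\frac{\partial}{\partial\phi}.
\end{aligned}
\]

Check directly that these satisfy the $\mathfrak{so}(3)$ commutation relations, as they should. 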


The representations we’ve discussed so far have primarily been representations 
of matrix Lie groups and their associated Lie algebras. As we mentioned earlier, 
though, there are abstract Lie algebras which have physically relevant representa- 
tions too. We’ll meet a couple of those now. 


⁹ Each $Y \in \tilde{\mathcal{H}}_l$ is the restriction of a polynomial to $S^2$ and is hence continuous, hence $|Y|^2$ must 
have a finite maximum $M \in \mathbb{R}$. This implies $\langle Y | Y \rangle \le 4\pi M < \infty$. 


Example 5.10. The Heisenberg algebra acting on $L^2(\mathbb{R})$ 


We’ve mentioned this representation a few times already in this text, but we discuss 
it here to formalize it and place it in its proper context. The Heisenberg algebra 
$H = \mathrm{Span}\{q, p, 1\} \subset C(\mathbb{R}^2)$ has a unitary Lie algebra representation $\pi$ on $L^2(\mathbb{R})$ 
given by 

\[
\begin{aligned}
(\pi(q)f)(x) &= i x f(x) \\
(\pi(p)f)(x) &= -\frac{df}{dx}(x) \\
(\pi(1)f)(x) &= i f(x).
\end{aligned} \tag{5.16}
\]

To verify that $\pi$ is indeed a Lie algebra representation, one needs only to verify 
that the one nontrivial bracket is preserved, i.e. that $[\pi(q), \pi(p)] = \pi([q, p])$. This 
should be a familiar fact by now, and is readily verified if not. Showing that $\pi$ is 
unitary requires a little bit of calculation, which you will perform below. The factors 
of $i$ appearing above, especially the one in (5.16), may look funny; you should keep 
in mind, though, that the absence of these factors of $i$ in the physics literature is 
again an artifact of the physicist’s convention in defining Lie algebras (cf. Box 4.4), 
and that these factors of $i$ are crucial for ensuring that the above operators are 
anti-Hermitian, as will be seen below. Note that the usual physics notation for these 
operators is Â = my = m(q) and p= n, = 7 (p). 
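For completeness, the one nontrivial bracket can be written out explicitly. The short computation below is a sketch that assumes the bracket $[q, p] = 1$ in the Heisenberg algebra (i.e. the Poisson bracket $\{q, p\} = 1$) and uses only the three defining formulas $(\pi(q)f)(x) = ixf(x)$, $(\pi(p)f)(x) = -f'(x)$, and $(\pi(1)f)(x) = if(x)$.

```latex
\[
\begin{aligned}
\big([\pi(q), \pi(p)]\,f\big)(x)
  &= \pi(q)\big(\pi(p) f\big)(x) - \pi(p)\big(\pi(q) f\big)(x) \\
  &= ix\left(-\frac{df}{dx}(x)\right) + \frac{d}{dx}\big(ix\,f(x)\big) \\
  &= -ix\,\frac{df}{dx}(x) + i f(x) + ix\,\frac{df}{dx}(x) \\
  &= i f(x) = (\pi(1) f)(x).
\end{aligned}
\]
```

The factor of $i$ in $\pi(1)$ is thus forced by the factors of $i$ in $\pi(q)$ and the requirement that brackets be preserved.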

Exercise 5.9. Verify that $\pi(q)$, $\pi(p)$, and $\pi(1)$ are all anti-Hermitian operators with 
respect to the usual inner product on $L^2(\mathbb{R})$. Exponentiate these operators to find, for 
$t, a, \theta \in \mathbb{R}$, 

\[
\begin{aligned}
(e^{t\pi(q)} f)(x) &= e^{itx} f(x) \\
(e^{a\pi(p)} f)(x) &= f(x - a) \\
(e^{\theta\pi(1)} f)(x) &= e^{i\theta} f(x)
\end{aligned}
\]

and conclude, as we saw (in part) in Example 4.38, that $\pi(q)$ generates translations in 
momentum space, $\pi(p)$ generates translations in $x$, and $\pi(1)$ generates multiplication by a 
phase factor. 


Example 5.11. The adjoint representation of $C(P)$ 


Recall from Example 4.40 that for any Lie algebra $\mathfrak{g}$, regardless of whether or not it 
is the Lie algebra of a matrix Lie group, there is a Lie algebra homomorphism 

\[
\begin{aligned}
\mathrm{ad} : \mathfrak{g} &\to \mathfrak{gl}(\mathfrak{g}) \\
X &\mapsto \mathrm{ad}_X
\end{aligned}
\]

and hence a representation of $\mathfrak{g}$ on itself. Suppose that $\mathfrak{g} = C(\mathbb{R}^6)$, the Lie algebra of 
observables on the phase space $\mathbb{R}^6$ which corresponds to a single particle living in 
three-dimensional space. What does the adjoint representation of $C(\mathbb{R}^6)$ look like? 
Since $C(\mathbb{R}^6)$ is infinite-dimensional, computing the matrix representations of basis 
elements is not really feasible. Instead, we pick a few important elements of $C(\mathbb{R}^6)$ 
and determine how they act on $C(\mathbb{R}^6)$ as linear differential operators. First, consider 
$\mathrm{ad}_{q_i} \in \mathfrak{gl}(C(\mathbb{R}^6))$. We have, for arbitrary $f \in C(\mathbb{R}^6)$, 

\[ \mathrm{ad}_{q_i} f = \{q_i, f\} = \frac{\partial f}{\partial p_i} \]



and so just as $-\frac{d}{dx}$ generates translation in the $x$-direction, $\mathrm{ad}_{q_i} = \frac{\partial}{\partial p_i}$ generates 
translation in the $p_i$ direction in phase space. Similarly, you can calculate that 

\[ \mathrm{ad}_{p_i} = -\frac{\partial}{\partial q_i} \tag{5.17} \]

so that $\mathrm{ad}_{p_i}$ generates translation in the $q_i$ direction. If $f \in C(\mathbb{R}^6)$ depends only on 
the $q_i$ and not the $p_i$, then one can show that 

\[ \mathrm{ad}_{L_3} f = -\frac{\partial f}{\partial \phi} \tag{5.18} \]

where $\phi = \tan^{-1}(q_2/q_1)$ is the azimuthal angle, so that $L_3$ generates rotations 
around the $z$-axis. Finally, for arbitrary $f \in C(\mathbb{R}^6)$, one can show using Hamilton’s 
equations that 

\[ \mathrm{ad}_H = -\frac{d}{dt} \tag{5.19} \]


so that the Hamiltonian generates time translations. These facts are all part of the 
Poisson Bracket formulation of classical mechanics, and it is from this formalism 
that quantum mechanics gets the notion that the various “symmetry generators” 
(which are of course just elements of the Lie algebra of the symmetry group G in 
question) that act on a Hilbert space should correspond to physical observables. 

Exercise 5.10. Verify (5.17)–(5.19). If you did Exercise 4.33, you only need to verify 
(5.19), for which you’ll need Hamilton’s equations 

\[ \frac{\partial H}{\partial q_i} = -\frac{dp_i}{dt}, \qquad \frac{\partial H}{\partial p_i} = \frac{dq_i}{dt}. \]


5.4 Tensor Product Representations 


The next step in our study of representations is to learn how to take tensor products 
of representations. This is important for several reasons: First, as we will soon see, 
almost all representations of interest can be viewed as tensor products of other, more 
basic representations. Second, tensor products are ubiquitous in quantum mechanics 
(since they represent the addition of degrees of freedom), so we better know how 
they interact with representations. Finally, tensors can be understood as elements 
of tensor product spaces (cf. Sect. 3.5) and so understanding the tensor product 
of representations will allow us to understand more fully what is meant by the 
statement that a particular object “transforms like a tensor.” 

Suppose, then, that we have two representations $(\Pi_1, V_1)$ and $(\Pi_2, V_2)$ of a 
group $G$. Then we can define their tensor product representation $(\Pi_1 \otimes \Pi_2, 
V_1 \otimes V_2)$ by 

\[ (\Pi_1 \otimes \Pi_2)(g) = \Pi_1(g) \otimes \Pi_2(g) \in \mathcal{L}(V_1 \otimes V_2), \tag{5.20} \]

where you should recall from (3.56) how $\Pi_1(g) \otimes \Pi_2(g)$ acts on $V_1 \otimes V_2$. 
It is straightforward to check that this really does define a representation of $G$. 
If $G$ is a matrix Lie group, we can then calculate the corresponding Lie algebra 
representation: 

\[
\begin{aligned}
(\pi_1 \otimes \pi_2)(X) &= \frac{d}{dt}\Big( \Pi_1(e^{tX}) \otimes \Pi_2(e^{tX}) \Big)\Big|_{t=0} \\
&= \lim_{h \to 0} \frac{\Pi_1(e^{hX}) \otimes \Pi_2(e^{hX}) - I \otimes I}{h} \\
&= \lim_{h \to 0} \frac{\Pi_1(e^{hX}) \otimes \Pi_2(e^{hX}) - I \otimes \Pi_2(e^{hX}) + I \otimes \Pi_2(e^{hX}) - I \otimes I}{h} \\
&= \lim_{h \to 0} \left[ \frac{\big(\Pi_1(e^{hX}) - I\big) \otimes \Pi_2(e^{hX})}{h} \right]
 + \lim_{h \to 0} \left[ \frac{I \otimes \big(\Pi_2(e^{hX}) - I\big)}{h} \right] \\
&= \pi_1(X) \otimes I + I \otimes \pi_2(X),
\end{aligned} \tag{5.21}
\]


where in the second-to-last line we used the bilinearity of the tensor product. If we 
think of $\pi_i$ as a sort of “derivative” of $\Pi_i$, then one can think of (5.21) as a kind of 
product rule. In fact, the above calculation is totally analogous to the proof of the 
product rule from single-variable calculus! It is a nice exercise to directly verify that 
$\pi_1 \otimes \pi_2$ is a Lie algebra representation; this is Exercise 5.11 below. 


Exercise 5.11. Verify that (5.21) defines a Lie algebra representation. Mainly, this consists 
of verifying that 

\[ [(\pi_1 \otimes \pi_2)(X),\, (\pi_1 \otimes \pi_2)(Y)] = (\pi_1 \otimes \pi_2)([X, Y]). \]
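A concrete instance of (5.21) may help before attempting the general verification. The sketch below builds the product representation from the vector representation of $\mathfrak{so}(3)$ and the spinor representation of $\mathfrak{su}(2) \simeq \mathfrak{so}(3)$, using the Kronecker product as the matrix of a tensor product of operators, and checks that the resulting $6 \times 6$ matrices obey the same bracket relations. It is a numerical illustration for one pair of representations, not a proof; the helper names are hypothetical and the spinor matrices assume $S_i = -\frac{i}{2}\sigma_i$.

```python
def matmul(a, b):
    """Product of two square matrices stored as nested lists."""
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def commutator(a, b):
    """Matrix commutator [a, b] = ab - ba."""
    ab, ba = matmul(a, b), matmul(b, a)
    return [[x - y for x, y in zip(ra, rb)] for ra, rb in zip(ab, ba)]

def kron(a, b):
    """Kronecker product: the matrix of a (x) b in the product basis."""
    return [[a[i][j] * b[k][l] for j in range(len(a)) for l in range(len(b))]
            for i in range(len(a)) for k in range(len(b))]

def add(a, b):
    return [[x + y for x, y in zip(ra, rb)] for ra, rb in zip(a, b)]

def identity(n):
    return [[1 if i == j else 0 for j in range(n)] for i in range(n)]

# pi_1: vector representation (images of L_1, L_2, L_3).
L = [
    [[0, 0, 0], [0, 0, -1], [0, 1, 0]],
    [[0, 0, 1], [0, 0, 0], [-1, 0, 0]],
    [[0, -1, 0], [1, 0, 0], [0, 0, 0]],
]
# pi_2: spinor representation of the same basis elements (ASSUMED S_i = -(i/2) sigma_i).
S = [
    [[0, -0.5j], [-0.5j, 0]],
    [[0, -0.5], [0.5, 0]],
    [[-0.5j, 0], [0, 0.5j]],
]

def product_rep(i):
    """(pi_1 (x) pi_2)(X_i) = pi_1(X_i) (x) I + I (x) pi_2(X_i), as in (5.21)."""
    return add(kron(L[i], identity(2)), kron(identity(3), S[i]))

brackets_ok = all(
    commutator(product_rep(i), product_rep((i + 1) % 3)) == product_rep((i + 2) % 3)
    for i in range(3)
)
print(brackets_ok)
```

The cross terms cancel because $\pi_1(X) \otimes I$ and $I \otimes \pi_2(Y)$ commute, which is the key observation for Exercise 5.11.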


You may find the form of (5.21) familiar from the discussion below (3.56), as 
well as from other quantum mechanics texts; we are now in a position to explain 
this connection, as well as clarify what is meant by the terms “additive” and 
“multiplicative” quantum numbers. 


Example 5.12. Quantum mechanics, tensor product representations, and additive 
and multiplicative quantum numbers 


We have already discussed how in quantum mechanics one adds degrees of 
freedom by taking a tensor product of Hilbert spaces. We have also discussed 
(in Example 4.37) how a matrix Lie group of symmetries of a physical system 
(i.e., a matrix Lie group G that acts on the phase space P and preserves the 
Hamiltonian $H$) gives rise to a Lie algebra of observables isomorphic to its own 
Lie algebra $\mathfrak{g}$, and how the quantum-mechanical Hilbert space associated with that 
system should be a representation of $\mathfrak{g}$. Thus, if we have a composite physical system 
represented by a Hilbert space $\mathcal{H} = \mathcal{H}_1 \otimes \mathcal{H}_2$, and if the $\mathcal{H}_i$ carry representations $\Pi_i$ 
of some matrix Lie group of symmetries $G$, then it’s natural to take as an additional 
axiom that $G$ is represented on $\mathcal{H}$ by the tensor product representation (5.20), which 
induces the representation (5.21) of $\mathfrak{g}$ on $\mathcal{H}$. 

For example, let G be the group of rotations SO(3), and let $\mathcal{H}_1 = L^2(\mathbb{R}^3)$
correspond to the spatial degrees of freedom of a particle of spin s and $\mathcal{H}_2 = \mathbb{C}^{2s+1}$,
$2s \in \mathbb{N}$, correspond to the internal spin degree of freedom. Then $\mathfrak{so}(3)$ is represented
on the total space $\mathcal{H} = L^2(\mathbb{R}^3) \otimes \mathbb{C}^{2s+1}$ by

\[ (\pi \otimes \pi_s)(L_i) = \pi(L_i) \otimes I + I \otimes \pi_s(L_i), \tag{5.22} \]

where $\pi$ is the representation of Example 5.8 and $\pi_s$ is the spin s representation
from Example 5.7. If we identify $(\pi \otimes \pi_s)(L_i)$ with $J_i$, the ith component
of the total angular momentum operator, and $\pi(L_i)$ with $L_i$, the ith component of
the orbital angular momentum operator, and $\pi_s(L_i)$ with $S_i$, the ith component
of the spin angular momentum operator, then (5.22) is just the component form of

\[ \mathbf{J} = \mathbf{L} \otimes I + I \otimes \mathbf{S}, \]


the familiar equation expressing the total angular momentum as the sum of the spin 
and orbital angular momentum. We thus see that the form of this equation, which 
we weren’t in a position to understand (mathematically) when we first discussed 
it in Example 3.20, is dictated by representation theory, and in particular by the 
form (5.21) of the induced representation of a Lie algebra on a tensor product 
space. The same is true for other symmetry generators, like the translation generator
p in the Heisenberg algebra. If we have two particles in one dimension with
corresponding Hilbert spaces $\mathcal{H}_i$, i = 1, 2, along with representations $\pi_i$ of the
Heisenberg algebra, then the representation of p on the total space $\mathcal{H} = \mathcal{H}_1 \otimes \mathcal{H}_2$
is just

\[ (\pi_1 \otimes \pi_2)(p) = \pi_1(p) \otimes I + I \otimes \pi_2(p) = p_1 \otimes I + I \otimes p_2, \]

where $p_i = \pi_i(p)$. This expresses the fact that the total momentum is just the sum
of the momenta of the individual particles!

More generally, (5.21) can be seen as the mathematical expression of the fact 
that physical observables corresponding to generators in the Lie algebra are 
additive. More precisely, we have the following: let $v_i \in \mathcal{H}_i$, i = 1, 2 be
eigenvectors of operators $\pi_i(A)$ with eigenvalues $a_i$, where A is an element of the
Lie algebra of a symmetry group G. Then $v_1 \otimes v_2$ is an eigenvector of $(\pi_1 \otimes \pi_2)(A)$
with eigenvalue $a_1 + a_2$. In other words, the eigenvalue a of A is an additive
quantum number. Most familiar quantum numbers, such as energy, momentum,
and angular momentum, are additive quantum numbers, but there are exceptions.
One such exception is the parity operator P. If we have two three-dimensional
physical systems with corresponding Hilbert spaces $\mathcal{H}_i$, then the $\mathcal{H}_i$ should furnish
representations $\Pi_i$ of O(3). Now, $P = -I \in O(3)$, and if $v_i \in \mathcal{H}_i$ are eigenvectors
of $\Pi_i(P)$ with eigenvalues $\lambda_i$, then $v_1 \otimes v_2 \in \mathcal{H}_1 \otimes \mathcal{H}_2$ is an eigenvector of
$(\Pi_1 \otimes \Pi_2)(P)$ with eigenvalue $\lambda_1 \lambda_2$, since

\[
\begin{aligned}
((\Pi_1 \otimes \Pi_2)(P))(v_1 \otimes v_2) &= \Pi_1(P)v_1 \otimes \Pi_2(P)v_2 \\
&= (\lambda_1 \lambda_2)\, v_1 \otimes v_2,
\end{aligned}
\]


where we used the property (3.37c) of the tensor product in the last line. Thus parity 
is known as a multiplicative quantum number. This is due to the fact that the parity 
operator P is an element of the symmetry group, whereas most other observables 
are elements of the symmetry algebra (usually the Lie algebra corresponding to the 
symmetry group). □
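
The contrast between additive and multiplicative quantum numbers is easy to see with small matrices. The sketch below assumes NumPy and uses made-up diagonal operators on a two-dimensional and a three-dimensional space (toy data, not a physical system): a Lie algebra element acts on the product space by the product rule (5.21), while a group element such as parity acts factor by factor.

```python
import numpy as np

# Toy operators (hypothetical data): pi_1(A), pi_2(A) and Pi_1(P), Pi_2(P).
A1 = np.diag([1.0, -1.0])          # eigenvalues a_1 of pi_1(A)
A2 = np.diag([0.5, -0.5, 1.5])     # eigenvalues a_2 of pi_2(A)
P1 = np.diag([1.0, -1.0])          # parity eigenvalues lambda_1
P2 = np.diag([-1.0, 1.0, -1.0])    # parity eigenvalues lambda_2

v1 = np.array([0.0, 1.0])          # eigenvector with a_1 = -1,  lambda_1 = -1
v2 = np.array([0.0, 0.0, 1.0])     # eigenvector with a_2 = 1.5, lambda_2 = -1
v = np.kron(v1, v2)                # components of v_1 (x) v_2

# A Lie algebra element acts by the product rule (5.21) ...
A_total = np.kron(A1, np.eye(3)) + np.kron(np.eye(2), A2)
# ... while a group element acts on each factor separately.
P_total = np.kron(P1, P2)

additive_ok = np.allclose(A_total @ v, (-1.0 + 1.5) * v)               # a_1 + a_2
multiplicative_ok = np.allclose(P_total @ v, ((-1.0) * (-1.0)) * v)    # lambda_1 * lambda_2
```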


Our next order of business is to clarify what it means for an object to “transform 
like a tensor.” We already addressed this to a certain degree in Sect. 3.2 when we
discussed change of bases. There, however, we looked at how a change of basis
affects the component representation of tensors, which was the passive point of view.
Here we will take the active point of view, where instead of changing bases we will
be considering a group G acting on a vector space V via some representation $\Pi$.
Taking the active point of view should nonetheless give the same transformation 
laws, and we’ll indeed see that considering tensor product representations of G in 
components reproduces the formulae from Sect. 3.2, but in the active form. 

Recall from Sect. 3.5 that the set of tensors of rank (r, s) on a vector space V is 
just 


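\[ \mathcal{T}^r_s(V) = \underbrace{V^* \otimes \cdots \otimes V^*}_{r \text{ times}} \otimes \underbrace{V \otimes \cdots \otimes V}_{s \text{ times}}. \]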


Given a representation $\Pi$ of G on V, we’d like to extend this representation to the
vector space $\mathcal{T}^r_s(V)$. To do this, we need to specify a representation of G on $V^*$.
This is easily done: $V^*$ is a vector space of functions on V (just the linear functions,
in fact), so we can use (5.9) to obtain the dual representation $(\Pi^*, V^*)$, defined as

\[ (\Pi^*_g f)(v) \equiv f(\Pi_{g^{-1}} v), \quad g \in G,\ f \in V^*,\ v \in V. \tag{5.23} \]

This representation has the nice property that if $\{e_i\}$ and $\{e^i\}$ are dual bases for V
and $V^*$, then the bases $\{\Pi(g)e_i\}$ and $\{\Pi^*(g)e^i\}$ are also dual to each other for any
$g \in G$. You will check this in Exercise 5.12 below.
With the dual representation in hand, we can then consider the tensor product
representation

\[ \Pi^r_s = \underbrace{\Pi^* \otimes \cdots \otimes \Pi^*}_{r \text{ times}} \otimes \underbrace{\Pi \otimes \cdots \otimes \Pi}_{s \text{ times}}. \]

This is just given by applying the appropriate operator $\Pi(g)$ or $\Pi^*(g)$ to each
factor in the tensor product, i.e.

\[ (\Pi^r_s(g))(f_1 \otimes \cdots \otimes f_r \otimes v_1 \otimes \cdots \otimes v_s) = \Pi^*(g) f_1 \otimes \cdots \otimes \Pi^*(g) f_r \otimes \Pi(g) v_1 \otimes \cdots \otimes \Pi(g) v_s. \tag{5.24} \]


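If G is a matrix Lie group, then the corresponding Lie algebra representation $\pi^r_s$ is
given by repeated application of the product rule (5.21), which produces r + s terms
with either a $\pi(X)$ or a $\pi^*(X)$ acting on one of the factors in each term. That is,
for $X \in \mathfrak{g}$,

\[
\begin{aligned}
(\pi^r_s(X))(f_1 \otimes \cdots \otimes f_r \otimes v_1 \otimes \cdots \otimes v_s)
&= \pi^*(X) f_1 \otimes f_2 \otimes \cdots \otimes f_r \otimes v_1 \otimes \cdots \otimes v_s \\
&\quad + f_1 \otimes \pi^*(X) f_2 \otimes \cdots \otimes f_r \otimes v_1 \otimes \cdots \otimes v_s \\
&\quad + \cdots \\
&\quad + f_1 \otimes \cdots \otimes \pi^*(X) f_r \otimes v_1 \otimes \cdots \otimes v_s \\
&\quad + f_1 \otimes \cdots \otimes f_r \otimes \pi(X) v_1 \otimes \cdots \otimes v_s \\
&\quad + f_1 \otimes \cdots \otimes f_r \otimes v_1 \otimes \pi(X) v_2 \otimes \cdots \otimes v_s \\
&\quad + \cdots \\
&\quad + f_1 \otimes \cdots \otimes f_r \otimes v_1 \otimes \cdots \otimes \pi(X) v_s.
\end{aligned}
\tag{5.25}
\]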


We’ll get a handle on these formulae by considering several examples. 


Exercise 5.12. Let $(\Pi, V)$ be a representation of some group G and $(\Pi^*, V^*)$ its dual
representation. Show that if the bases $\{e_i\}$ and $\{e^i\}$ are dual to each other, then so are
$\{\Pi(g)e_i\}$ and $\{\Pi^*(g)e^i\}$ for any $g \in G$.


Exercise 5.13. Alternatively, we could have defined $\Pi^r_s$ by thinking of tensors as functions
on V and using the idea behind (5.9) to get

\[ (\Pi^r_s(g)T)(v_1, \dots, v_r, f_1, \dots, f_s) \equiv T(\Pi_{g^{-1}} v_1, \dots, \Pi_{g^{-1}} v_r, \Pi^*_{g^{-1}} f_1, \dots, \Pi^*_{g^{-1}} f_s). \]

Expand T in components and show that this is equivalent to the definition (5.24).


Example 5.13. The dual representation


Let’s consider (5.24) with r = 1, s = 0, in which case the vector space at hand is
just $V^*$ and our representation is just the dual representation (5.23). What does this
representation look like in terms of matrices? For arbitrary $f \in V^*$, $v \in V$ we have

\[
\begin{aligned}
(\Pi^*_g f)(v) &= [\Pi^*_g f]^T [v] \\
&= ([\Pi^*_g][f])^T [v] \\
&= [f]^T [\Pi^*_g]^T [v]
\end{aligned}
\]

as well as

\[
\begin{aligned}
(\Pi^*_g f)(v) &= f(\Pi_{g^{-1}} v) \\
&= [f]^T [\Pi_{g^{-1}}][v],
\end{aligned}
\]

which implies

\[ [\Pi^*_g]^T = [\Pi_{g^{-1}}] = [\Pi_g^{-1}] = [\Pi_g]^{-1}. \]

(Why are those last two equalities true?) We thus obtain

\[ [\Pi^*_g] = ([\Pi_g]^{-1})^T. \tag{5.26} \]


In other words, the matrices representing G on the dual space are just the inverse 
transposes of the matrices representing G on the original vector space! This is 
just the active transformation version of (3.24b). Accordingly, if $(\Pi, V)$ is the
fundamental representation of O(n) or SO(n), then the matrices of the dual 
representation are identical to those of the fundamental, which is just another 
expression of the fact that dual vectors transform just like regular vectors under 
orthogonal transformations (passive or active). 
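
The inverse-transpose rule (5.26) is easy to test with explicit matrices. The sketch below assumes NumPy and, purely for illustration, takes $\Pi$ to be the defining action of 3×3 invertible matrices on $\mathbb{R}^3$; the matrices `g`, `h`, `R` and the helper `dual` are our own choices. It checks that the inverse transposes again form a representation, that the pairing f(v) is preserved when both f and v are transformed, and that a rotation matrix equals its own inverse transpose.

```python
import numpy as np

def dual(M):
    """Matrix of the dual representation: the inverse transpose, as in (5.26)."""
    return np.linalg.inv(M).T

g = np.array([[2.0, 1.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 3.0]])   # det = 6
h = np.array([[1.0, 0.0, 1.0], [0.0, 2.0, 0.0], [1.0, 0.0, 2.0]])   # det = 2

theta = 0.3
R = np.array([[np.cos(theta), -np.sin(theta), 0.0],
              [np.sin(theta),  np.cos(theta), 0.0],
              [0.0, 0.0, 1.0]])                                      # a rotation

f = np.array([1.0, -2.0, 0.5])    # components of a dual vector
v = np.array([0.3, 0.7, -1.1])    # components of a vector

is_homomorphism = np.allclose(dual(g @ h), dual(g) @ dual(h))
pairing_preserved = np.isclose((dual(g) @ f) @ (g @ v), f @ v)
orthogonal_self_dual = np.allclose(dual(R), R)
```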

If G is a matrix Lie group, then $(\Pi, V)$ induces a representation $(\pi, V)$ of $\mathfrak{g}$,
and hence $(\Pi^*, V^*)$ should induce a representation $(\pi^*, V^*)$ of $\mathfrak{g}$ as well. What
does this representation look like? Well, by the definition of $\pi^*$, we have (for any
$X \in \mathfrak{g}$, $f \in V^*$, $v \in V$)

\[
\begin{aligned}
(\pi^*(X) f)(v) &= \frac{d}{dt} \left( \Pi^*(e^{tX}) f \right)(v) \Big|_{t=0} \\
&= \frac{d}{dt} f\left( \Pi(e^{-tX}) v \right) \Big|_{t=0} \\
&= f(-\pi(X) v).
\end{aligned}
\]

You will show below that in terms of matrices this means

\[ [\pi^*(X)] = -[\pi(X)]^T. \tag{5.27} \]

Note again that if $\pi$ is the fundamental representation of O(n) or SO(n) then
$[\pi(X)] = X$ is antisymmetric and so the dual representation is identical to the
original representation.

Exercise 5.14. Prove (5.27). This can be done a couple of different ways, either by calculating
the infinitesimal form of (5.26) (by letting $g = e^{tX}$ and differentiating at t = 0) or by a
computation analogous to the derivation of (5.26).


Example 5.14. $(\Pi^1_1, \mathcal{L}(V))$: The linear operator representation


Now consider (5.24) with (r, s) = (1, 1). Our vector space is then just $\mathcal{T}^1_1(V) = \mathcal{L}(V)$,
the space of linear operators on V! Thus, any representation of a group G on a
vector space V leads naturally to a representation of G on the space of operators
on V. What does this representation look like? We know from (5.24) that G acts on
$\mathcal{T}^1_1(V)$ by

\[ (\Pi^1_1(g))(f \otimes v) = (\Pi^*_g f) \otimes (\Pi_g v), \quad v \in V,\ f \in V^*,\ g \in G. \tag{5.28} \]

This isn’t very enlightening, though. To interpret this, consider $f \otimes v$ as a linear
operator T on V, so that

\[ T(w) = f(w)\, v, \quad w \in V. \]

Then (careful working your way through these equalities!)

\[
\begin{aligned}
(\Pi^1_1(g)T)(w) &= (\Pi^*_g f)(w)\, \Pi_g v \\
&= f(\Pi_{g^{-1}} w)\, \Pi_g v && \text{by definition of } \Pi^*_g \\
&= \Pi_g\left( f(\Pi_{g^{-1}} w)\, v \right) && \text{since } \Pi_g \text{ is linear} \\
&= \Pi_g\left( T(\Pi_{g^{-1}} w) \right) && \text{by definition of } T \\
&= (\Pi_g T \Pi_g^{-1})(w)
\end{aligned}
\]

so we have

\[ \Pi^1_1(g)T = \Pi_g T \Pi_g^{-1}. \tag{5.29} \]

It’s easy to check that this computation also holds for an arbitrary $T \in \mathcal{L}(V)$, since
any such T can be written as a linear combination of terms of the form $f \otimes v$.
Thus, (5.29) tells us that the tensor product representation of G on $V^* \otimes V = \mathcal{L}(V)$
is just the original representation acting on operators by similarity transformations!
This should not be too surprising, and you perhaps could have guessed that this is
how the action of G on V would extend to $\mathcal{L}(V)$. Representing (5.29) by matrices
yields

\[ [\Pi^1_1(g)T] = [\Pi_g][T][\Pi_g]^{-1}, \tag{5.30} \]

which is just the active version of (3.27).
If G is a matrix Lie group, we can also consider the induced Lie algebra representation
$(\pi^1_1, \mathcal{L}(V))$, which according to (5.21) acts by

\[ \pi^1_1(X)(f \otimes v) = (\pi^*(X) f) \otimes v + f \otimes \pi(X) v. \]

Since we saw in Chap. 4 that the adjoint representation of G induces the adjoint
representation of $\mathfrak{g}$, we expect that $\pi^1_1$ should act by the commutator. This is in fact
the case, and you will show below that

\[ \pi^1_1(X)T = [\pi(X), T]. \tag{5.31} \]

This, of course, reduces to a commutator of matrices when a basis is chosen. □


Exercise 5.15. Prove (5.31). As in Exercise 5.14, this can be done by either computing the 
infinitesimal form of (5.29), or performing a calculation similar to the one above (5.29), 
starting with the Lie algebra representation associated with (5.28). 
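
Both (5.29) and (5.31) can be checked numerically on a single decomposable operator $f \otimes v$. In the sketch below (NumPy assumed) we take $\Pi_g = g$ and $\pi(X) = X$ for 3×3 matrices, which is an illustrative choice of representation rather than anything forced by the text, and we use the fact that the operator $T(w) = f(w)v$ has matrix `np.outer(v, f)`.

```python
import numpy as np

g = np.array([[2.0, 1.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 3.0]])      # invertible, det = 6
X = np.array([[0.0, -1.0, 2.0], [1.0, 0.0, 0.5], [-2.0, -0.5, 1.0]])   # any 3x3 matrix
f = np.array([1.0, -2.0, 0.5])    # components of a dual vector
v = np.array([0.3, 0.7, -1.1])    # components of a vector

T = np.outer(v, f)                # matrix of f (x) v, since T(w) = f(w) v
g_inv = np.linalg.inv(g)

# Group level: (Pi*_g f) (x) (Pi_g v), using the inverse transpose (5.26) for Pi*_g.
group_action = np.outer(g @ v, g_inv.T @ f)
similarity = g @ T @ g_inv                    # right-hand side of (5.29)

# Lie algebra level: product rule, using (5.27) for pi*(X).
algebra_action = np.outer(v, -X.T @ f) + np.outer(X @ v, f)
commutator = X @ T - T @ X                    # right-hand side of (5.31)

group_ok = np.allclose(group_action, similarity)
algebra_ok = np.allclose(algebra_action, commutator)
```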


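It should come as no surprise that the examples above reproduced the formulae
from Sect. 3.2, and in fact it’s easy to show that the general tensor product
representation (5.24) is just the active version of our tensor transformation law (3.17). Using
the fact that

\[ (\Pi^*_g)_i{}^j = (\Pi_g^{-1})_i{}^j, \tag{5.32} \]

which you will prove below, we have for an arbitrary (r, s) tensor T,

\[
\begin{aligned}
\Pi^r_s(g)T &= T_{i_1 \dots i_r}{}^{j_1 \dots j_s}\, \Pi^r_s(g)\left( e^{i_1} \otimes \cdots \otimes e^{i_r} \otimes e_{j_1} \otimes \cdots \otimes e_{j_s} \right) \\
&= T_{i_1 \dots i_r}{}^{j_1 \dots j_s}\, \Pi^*_g e^{i_1} \otimes \cdots \otimes \Pi^*_g e^{i_r} \otimes \Pi_g e_{j_1} \otimes \cdots \otimes \Pi_g e_{j_s} \\
&= T_{i_1 \dots i_r}{}^{j_1 \dots j_s}\, (\Pi_g^{-1})_{k_1}{}^{i_1} \cdots (\Pi_g^{-1})_{k_r}{}^{i_r}\, (\Pi_g)_{j_1}{}^{l_1} \cdots (\Pi_g)_{j_s}{}^{l_s} \\
&\qquad \times\, e^{k_1} \otimes \cdots \otimes e^{k_r} \otimes e_{l_1} \otimes \cdots \otimes e_{l_s},
\end{aligned}
\tag{5.33}
\]

so that, relabeling dummy indices, we have

\[ (\Pi^r_s(g)T)_{i_1 \dots i_r}{}^{j_1 \dots j_s} = (\Pi_g^{-1})_{i_1}{}^{k_1} \cdots (\Pi_g^{-1})_{i_r}{}^{k_r}\, (\Pi_g)_{l_1}{}^{j_1} \cdots (\Pi_g)_{l_s}{}^{j_s}\, T_{k_1 \dots k_r}{}^{l_1 \dots l_s}, \tag{5.34} \]

which is just the active version of (3.17), with $\Pi(g)$ replacing A and $\Pi(g)^{-1}$
replacing $A^{-1}$.

Exercise 5.16. Verify (5.32).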


5.5 Symmetric and Antisymmetric Tensor Product 
Representations 


With the tensor product representation in place, we can now consider symmetric
and antisymmetric tensor product representations. These are important for
a few reasons. The symmetric tensor product, when applied to the fundamental
representation of SU(2), actually yields all the spin s representations of SU(2).
Meanwhile, the antisymmetric tensor product, when applied to the fundamental
representations of O(3) and O(3,1), yields the pseudovector and pseudoscalar
representations of these groups.

Before proceeding to these (and other) examples, we must convince ourselves
that the spaces of symmetric and antisymmetric tensors on a representation space
are in fact representations in their own right. Note that for tensors of type (r, 0) or
(0, r) the tensor product representation (5.24) is symmetric, in the sense that all the
factors in the tensor product are treated equally. A moment’s thought then shows
that if we have a completely symmetric tensor $T \in S^r(V)$, then $\Pi^0_r(g)T$ is also
in $S^r(V)$. The same is true for the completely antisymmetric tensors $\Lambda^r V$, and for
the spaces $S^r(V^*)$ and $\Lambda^r V^*$. Thus these subspaces of $\mathcal{T}^0_r(V)$ and $\mathcal{T}^r_0(V)$ indeed
furnish representations of G in their own right.


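Example 5.15. $S^l(\mathbb{C}^2)$


Consider $S^l(\mathbb{C}^2)$, the completely symmetric (0, l) tensors on $\mathbb{C}^2$. By way of
example, when l = 3 a basis for this space is given by

\[
\begin{aligned}
v_0 &= e_1 \otimes e_1 \otimes e_1 \\
v_1 &= e_2 \otimes e_1 \otimes e_1 + e_1 \otimes e_2 \otimes e_1 + e_1 \otimes e_1 \otimes e_2 \\
v_2 &= e_2 \otimes e_2 \otimes e_1 + e_2 \otimes e_1 \otimes e_2 + e_1 \otimes e_2 \otimes e_2 \\
v_3 &= e_2 \otimes e_2 \otimes e_2
\end{aligned}
\tag{5.35}
\]

and in general we have

\[
\begin{aligned}
v_0 &= e_1 \otimes e_1 \otimes \cdots \otimes e_1 \\
v_1 &= e_2 \otimes e_1 \otimes \cdots \otimes e_1 + \text{permutations} \\
v_2 &= e_2 \otimes e_2 \otimes e_1 \otimes \cdots \otimes e_1 + \text{permutations} \\
&\;\;\vdots \\
v_{l-1} &= e_2 \otimes e_2 \otimes \cdots \otimes e_1 + \text{permutations} \\
v_l &= e_2 \otimes e_2 \otimes \cdots \otimes e_2.
\end{aligned}
\tag{5.36}
\]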


Now consider the fundamental representation $(\Pi, \mathbb{C}^2)$ of SU(2). The tensor
product representation $(\Pi^0_l, \mathcal{T}^0_l(\mathbb{C}^2))$ restricts to $S^l(\mathbb{C}^2) \subset \mathcal{T}^0_l(\mathbb{C}^2)$, yielding a
representation of SU(2) which we’ll denote as $(S^l\Pi, S^l(\mathbb{C}^2))$. Taking (5.36) as a
basis for this space, one can easily compute the corresponding $\mathfrak{su}(2)$ representation,
denoted $(S^l\pi, S^l(\mathbb{C}^2))$. It’s then easy to show that

\[
[(S^l\pi)(S_z)] = -i \begin{pmatrix} l/2 & & & \\ & l/2 - 1 & & \\ & & \ddots & \\ & & & -l/2 \end{pmatrix}
\tag{5.37}
\]

which is the same as (5.11) (up to a sign; this can be eliminated by reversing the
order of the basis vectors). This suggests that $S^l(\mathbb{C}^2)$ is the same representation
as $P_l(\mathbb{C}^2)$, and is thus also the same as the spin l/2 representation of $\mathfrak{su}(2)$. We
will soon see that this is indeed the case. Note that we have a correspondence here
between symmetric (0, l) tensors on a vector space and degree l polynomials on a
vector space, just as we did in Example 3.24.


Exercise 5.17. Verify (5.37). 
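
A numerical spot check of (5.37) for l = 3 may help before attempting the exercise. The sketch below assumes NumPy and the convention $S_z = -\tfrac{i}{2}\sigma_z = \mathrm{diag}(-i/2, i/2)$ for the $\mathfrak{su}(2)$ basis element; that convention is our assumption here, chosen because it reproduces the overall factor of $-i$ in (5.37). The helper names `kron_all` and `v` are ours.

```python
import itertools
from functools import reduce
import numpy as np

l = 3
Sz = np.diag([-0.5j, 0.5j])       # assumed convention: S_z = -(i/2) sigma_z
e = np.eye(2)                     # rows are e_1 and e_2

def kron_all(factors):
    """Kronecker product of a list of vectors or matrices."""
    return reduce(np.kron, factors)

# Product rule for a (0, l) tensor: S_z acts on one tensor factor at a time.
Sz_total = sum(
    kron_all([Sz if pos == slot else np.eye(2) for pos in range(l)])
    for slot in range(l)
)

def v(k):
    """Symmetric basis vector v_k of (5.36): k factors of e_2, the rest e_1."""
    labels = [1] * k + [0] * (l - k)
    arrangements = set(itertools.permutations(labels))   # distinct orderings only
    return sum(kron_all([e[i] for i in arr]) for arr in arrangements)

# v_k should be an eigenvector with eigenvalue -i (l/2 - k), as in (5.37).
is_eigen = [
    np.allclose(Sz_total @ v(k), -1j * (l / 2 - k) * v(k))
    for k in range(l + 1)
]
```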


Example 5.16. Antisymmetric 2nd rank tensors and the adjoint representation 
of O(n) 


Consider the fundamental representation $(\Pi, \mathbb{R}^n)$ of O(n). We can restrict the tensor
product representation $(\Pi^0_2, \mathbb{R}^n \otimes \mathbb{R}^n)$ to the subspace $\Lambda^2\mathbb{R}^n$ to get the antisymmetric
tensor product representation of O(n) on $\Lambda^2\mathbb{R}^n$, which we’ll denote as $\Lambda^2\Pi$. Let
$X = X^{ij} e_i \otimes e_j \in \Lambda^2\mathbb{R}^n$ with the standard basis. Then by (5.24), we have (for
$R \in O(n)$)

\[ \Lambda^2\Pi(R)(X) = X^{ij}\, Re_i \otimes Re_j = \sum_{k,l} X^{ij} R_{ki} R_{lj}\, e_k \otimes e_l, \]

which in terms of matrices reads

\[ [\Lambda^2\Pi(R)X] = R[X]R^T = R[X]R^{-1}. \tag{5.38} \]

So far we have just produced the active transformation law for a (0, 2) tensor, and we
have not yet made use of the fact that X is antisymmetric. Taking the antisymmetry
of X into account, however, means that [X] is an antisymmetric matrix, and (5.38)
tells us that it transforms under O(n) by similarity transformations. This, however,
is an exact description of the adjoint representation of O(n)! So we conclude that
the adjoint representation of O(n) (and hence of SO(n) and $\mathfrak{so}(n)$) is the same as
the tensor product representation $(\Lambda^2\Pi, \Lambda^2\mathbb{R}^n)$. □
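
The component computation leading to (5.38) translates directly into an index contraction. The sketch below assumes NumPy; the particular orthogonal matrix `R` (a rotation composed with a reflection, so that det R = −1) and the antisymmetric components `X` are arbitrary illustrative choices.

```python
import numpy as np

theta = 0.7
rotation = np.array([[np.cos(theta), -np.sin(theta), 0.0],
                     [np.sin(theta),  np.cos(theta), 0.0],
                     [0.0, 0.0, 1.0]])
reflection = np.diag([1.0, 1.0, -1.0])
R = reflection @ rotation                 # an element of O(3) with det R = -1

X = np.array([[0.0, 1.5, -2.0],
              [-1.5, 0.0, 0.5],
              [2.0, -0.5, 0.0]])          # antisymmetric components X^{ij}

# Components of X^{ij} Re_i (x) Re_j, i.e. sum over i, j of R_{ki} R_{lj} X^{ij}.
transformed = np.einsum("ki,lj,ij->kl", R, R, X)

matches_538 = np.allclose(transformed, R @ X @ R.T)
is_adjoint_action = np.allclose(transformed, R @ X @ np.linalg.inv(R))
stays_antisymmetric = np.allclose(transformed, -transformed.T)
```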


Example 5.17. Antisymmetric tensor representations of O(3) 


Consider the antisymmetric tensor representations $(\Lambda^k\Pi, \Lambda^k\mathbb{R}^3)$, k = 1, 2, 3 of
O(3), obtained by restricting $(\Pi^0_k, \mathcal{T}^0_k(\mathbb{R}^3))$ to $\Lambda^k\mathbb{R}^3 \subset \mathcal{T}^0_k(\mathbb{R}^3)$. For convenience
we’ll define $\Lambda^0\mathbb{R}^3$ to be the trivial representation on $\mathbb{R}$ (also known as the scalar
representation). Now, we already know that $\Lambda^1\mathbb{R}^3 = \mathbb{R}^3$ is the fundamental
representation, and from the previous example we know that $\Lambda^2\mathbb{R}^3$ is the adjoint
representation, also known as the pseudovector representation. What about $\Lambda^3\mathbb{R}^3$?
We know that this vector space is one-dimensional (why?), so is it just the trivial
representation? Not quite. Taking the Levi-Civita tensor $e_1 \wedge e_2 \wedge e_3$ as our basis
vector, we have:

\[
\begin{aligned}
(\Lambda^3\Pi(R))(e_1 \wedge e_2 \wedge e_3) &= (Re_1) \wedge (Re_2) \wedge (Re_3) \\
&= |R|\, e_1 \wedge e_2 \wedge e_3 && \text{by (3.90)} \\
&= \pm\, e_1 \wedge e_2 \wedge e_3.
\end{aligned}
\]

Thus $e_1 \wedge e_2 \wedge e_3$ is invariant under rotations but is still not a scalar, since it changes
sign under inversion. An object that transforms this way is known as a pseudoscalar,
and $\Lambda^3\mathbb{R}^3$ is thus known as the pseudoscalar representation of O(3). Note that
if we restrict to SO(3), then |R| = 1 and $\Lambda^3\mathbb{R}^3$ is then just the scalar (trivial)
representation, just as $\Lambda^2\mathbb{R}^3$ is just the vector representation of SO(3). Also, as we
mentioned before, if we restrict our representations to $\mathbb{Z}_2 \simeq \{I, -I\} \subset O(3)$ then
they reduce to either the trivial or the alternating representation. We summarize all
this in Table 5.3.

Table 5.3 The antisymmetric tensor representations $\Lambda^k\mathbb{R}^3$ of SO(3) and O(3)

V                        | O(3)         | SO(3)  | $\{I, -I\} \simeq \mathbb{Z}_2$
$\Lambda^0\mathbb{R}^3$  | Scalar       | Scalar | Trivial
$\Lambda^1\mathbb{R}^3$  | Vector       | Vector | Alt
$\Lambda^2\mathbb{R}^3$  | Pseudovector | Vector | Trivial
$\Lambda^3\mathbb{R}^3$  | Pseudoscalar | Scalar | Alt


Example 5.18. Antisymmetric tensor representations of O(3, 1) 


Here we repeat the analysis from the previous example but in the case of O(3, 1).
As above, $\Lambda^0\mathbb{R}^4 = \mathbb{R}$ is the trivial (scalar) representation, $\Lambda^1\mathbb{R}^4 = \mathbb{R}^4$ is the
vector representation, and $\Lambda^2\mathbb{R}^4$ is just known as the 2nd rank antisymmetric tensor
representation (or sometimes just tensor representation). How about $\Lambda^3\mathbb{R}^4$? To get
a handle on that, we’ll compute matrix representations for the corresponding Lie
algebra representation, using the following basis $\mathcal{B}$ for $\Lambda^3\mathbb{R}^4$:

\[
\begin{aligned}
f_1 &= e_2 \wedge e_3 \wedge e_4 \\
f_2 &= -e_1 \wedge e_3 \wedge e_4 \\
f_3 &= e_1 \wedge e_2 \wedge e_4 \\
f_4 &= e_1 \wedge e_2 \wedge e_3.
\end{aligned}
\]

You will check below that in this basis, the operators $(\Lambda^3\pi)(L_i)$ and $(\Lambda^3\pi)(K_i)$
are given by

\[
\begin{aligned}
[(\Lambda^3\pi)(L_i)]_{\mathcal{B}} &= L_i \\
[(\Lambda^3\pi)(K_i)]_{\mathcal{B}} &= K_i
\end{aligned}
\tag{5.39}
\]

so $(\Lambda^3\pi, \Lambda^3\mathbb{R}^4)$ is the same as the fundamental (vector) representation of $\mathfrak{so}(3, 1)$!
When we consider the group representation $\Lambda^3\Pi$, though, there is a slight difference
between $\Lambda^3\mathbb{R}^4$ and the vector representation; in the vector representation, the parity
operator takes the usual form

\[
\Pi(P) = P = \begin{pmatrix} -1 & 0 & 0 & 0 \\ 0 & -1 & 0 & 0 \\ 0 & 0 & -1 & 0 \\ 0 & 0 & 0 & 1 \end{pmatrix}
\tag{5.40}
\]

whereas on $\Lambda^3\mathbb{R}^4$, we have (as you will check below)

\[
[\Lambda^3\Pi(P)]_{\mathcal{B}} = \begin{pmatrix} 1 & 0 & 0 & 0 \\ 0 & 1 & 0 & 0 \\ 0 & 0 & 1 & 0 \\ 0 & 0 & 0 & -1 \end{pmatrix}
\tag{5.41}
\]

which is equal to −P. Thus the elements of $\Lambda^3\mathbb{R}^4$ transform like four-vectors
under infinitesimal Lorentz transformations (and, as we’ll show, under proper
Lorentz transformations), but they transform with the wrong sign under parity.
In analogy to the three-dimensional Euclidean case, $\Lambda^3\mathbb{R}^4$ is thus known as the
pseudovector representation of O(3, 1). As in the Euclidean case, if one restricts to
$SO(3, 1)_o$, then parity is excluded and then $\Lambda^3\mathbb{R}^4$ and $\mathbb{R}^4$ are identical, but only as
representations of the proper Lorentz group $SO(3, 1)_o$.

The next representation, $\Lambda^4\mathbb{R}^4$, is one-dimensional, but (as in the previous
example) is not quite the trivial representation. As above, we compute the action
of $\Lambda^4\Pi(\Lambda)$, $\Lambda \in O(3, 1)$, on the Levi-Civita tensor $e_1 \wedge e_2 \wedge e_3 \wedge e_4$:

\[
\begin{aligned}
(\Lambda^4\Pi(\Lambda))(e_1 \wedge e_2 \wedge e_3 \wedge e_4) &= (\Lambda e_1) \wedge (\Lambda e_2) \wedge (\Lambda e_3) \wedge (\Lambda e_4) \\
&= |\Lambda|\, e_1 \wedge e_2 \wedge e_3 \wedge e_4 \\
&= \pm\, e_1 \wedge e_2 \wedge e_3 \wedge e_4.
\end{aligned}
\]

Thus $e_1 \wedge e_2 \wedge e_3 \wedge e_4$ is invariant under proper Lorentz transformations but
changes sign under improper Lorentz transformations. As in the Euclidean case,
such an object is known as a pseudoscalar, and so $(\Lambda^4\Pi, \Lambda^4\mathbb{R}^4)$ is known as the
pseudoscalar representation of O(3, 1).

As in the previous example, we can use $\{I, P\} \simeq \mathbb{Z}_2$ to distinguish between
$\Lambda^3\mathbb{R}^4$ and $\mathbb{R}^4$, as well as between $\Lambda^4\mathbb{R}^4$ and the trivial representation. The operators
$\Pi(P)$ and $\Lambda^3\Pi(P)$ can be distinguished in a basis-independent way by noting
that $\Pi(P)$ is diagonalizable with eigenvalues {−1, −1, −1, 1}, whereas $\Lambda^3\Pi(P)$
has eigenvalues {1, 1, 1, −1}. We again summarize in Table 5.4, where in the last
column we write the eigenvalues of $\Lambda^k\Pi(P)$ in those cases where $\Lambda^k\Pi(P) \neq \pm I$:

Table 5.4 The antisymmetric tensor representations $\Lambda^k\mathbb{R}^4$ of $SO(3, 1)_o$ and O(3, 1)

V                        | O(3, 1)      | $SO(3, 1)_o$ | $\{I, P\} \simeq \mathbb{Z}_2$
$\Lambda^0\mathbb{R}^4$  | Scalar       | Scalar       | Trivial
$\Lambda^1\mathbb{R}^4$  | Vector       | Vector       | {−1, −1, −1, 1}
$\Lambda^2\mathbb{R}^4$  | Tensor       | Tensor       | {1, 1, 1, −1, −1, −1}
$\Lambda^3\mathbb{R}^4$  | Pseudovector | Vector       | {1, 1, 1, −1}
$\Lambda^4\mathbb{R}^4$  | Pseudoscalar | Scalar       | Alt


Exercise 5.18. By explicit computation, verify (5.39) and (5.41). Also, show that $\Lambda^2\Pi(P)$
has eigenvalues {−1, −1, −1, 1, 1, 1}. You may need to choose a basis for $\Lambda^2\mathbb{R}^4$ to do this.
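
The parity matrix (5.41) can be cross-checked by brute force before doing the exercise by hand. The sketch below assumes NumPy and represents $a \wedge b \wedge c$ as the fully antisymmetrized 4×4×4 array of components; the helpers `wedge` and `lambda3` are our own, and the normalization of the wedge product is irrelevant here because only ratios of components are used. It also checks one proper transformation, a rotation about the z-axis, on which $\Lambda^3\Pi$ should agree with the vector representation.

```python
import itertools
import numpy as np

def wedge(*vectors):
    """Fully antisymmetrized tensor product of the given vectors in R^4."""
    k = len(vectors)
    total = np.zeros((4,) * k)
    for perm in itertools.permutations(range(k)):
        sign = round(np.linalg.det(np.eye(k)[list(perm)]))   # sign of the permutation
        term = vectors[perm[0]]
        for idx in perm[1:]:
            term = np.multiply.outer(term, vectors[idx])
        total += sign * term
    return total

e1, e2, e3, e4 = np.eye(4)
# The basis B: f_1 = e2^e3^e4, f_2 = -e1^e3^e4, f_3 = e1^e2^e4, f_4 = e1^e2^e3.
triples = [(e2, e3, e4), (e1, e3, e4), (e1, e2, e4), (e1, e2, e3)]
signs = [1, -1, 1, 1]
basis = [s * wedge(*t) for s, t in zip(signs, triples)]

def lambda3(A):
    """Matrix of Lambda^3 Pi(A) in the basis B (column j is the image of f_j)."""
    M = np.zeros((4, 4))
    for col, (s, (a, b, c)) in enumerate(zip(signs, triples)):
        image = s * wedge(A @ a, A @ b, A @ c)
        for row, f in enumerate(basis):
            # The f_i have disjoint supports, so this projection is exact.
            M[row, col] = np.sum(image * f) / np.sum(f * f)
    return M

P = np.diag([-1.0, -1.0, -1.0, 1.0])       # parity, as in (5.40)
lambda3_P = lambda3(P)

c, s = np.cos(0.4), np.sin(0.4)
rotation = np.array([[c, -s, 0.0, 0.0], [s, c, 0.0, 0.0],
                     [0.0, 0.0, 1.0, 0.0], [0.0, 0.0, 0.0, 1.0]])
lambda3_rotation = lambda3(rotation)
```

Here `lambda3_P` comes out as diag(1, 1, 1, −1), that is, −P, while `lambda3_rotation` coincides with `rotation` itself.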


5.6 Equivalence of Representations 


In the previous section we noted that the vector and dual vector representations of
O(n) (and hence SO(n) and $\mathfrak{so}(n)$) were “the same,” as were the adjoint
representation of O(n) and the antisymmetric tensor product representation $(\Lambda^2\Pi, \Lambda^2\mathbb{R}^n)$.
We had also noted a few equivalences in Sect. 4.1, where we pointed out that the
adjoint representation of SO(3) was, in a certain matrix representation, identical
to the vector representation of SO(3), and where we claimed that the adjoint
representation of $\mathfrak{so}(3, 1)$ is equivalent to the antisymmetric 2nd rank tensor
representation. However, we never made precise what we meant when we said
that two representations were “the same” or “equivalent”; in the cases where we
attempted to prove such a claim, we usually just showed that the matrices of
two representations were identical when particular bases were chosen. Defining
equivalence in such a way is adequate but somewhat undesirable, as it requires
a choice of basis; we’d like an alternative definition that is more intrinsic and
conceptual and that doesn’t require a choice of coordinates. The desired definition
goes as follows: Suppose we have two representations $(\Pi_1, V_1)$ and $(\Pi_2, V_2)$ of a
group G. A linear map $\phi : V_1 \to V_2$ which satisfies

\[ \Pi_2(g)(\phi(v)) = \phi(\Pi_1(g)v) \quad \forall\, v \in V_1,\ g \in G \tag{5.42} \]

is said to be an intertwining map or intertwiner. This just means that the action
of G via the representations commutes with the action of $\phi$. If in addition $\phi$ is a
vector space isomorphism, then $(\Pi_1, V_1)$ and $(\Pi_2, V_2)$ are said to be equivalent.
Occasionally we’ll denote equivalence by $(\Pi_1, V_1) \simeq (\Pi_2, V_2)$. Equivalence of
Lie algebra representations and their corresponding intertwining maps are defined
similarly, by the equation


\[ \pi_2(X)(\phi(v)) = \phi(\pi_1(X)v) \quad \forall\, v \in V_1,\ X \in \mathfrak{g}. \tag{5.43} \]

Another way to write (5.42) is as an equality between maps that go from $V_1$ to $V_2$:

\[ \Pi_2(g) \circ \phi = \phi \circ \Pi_1(g) \quad \forall\, g \in G \tag{5.44} \]

and likewise for Lie algebra representations. When $\phi$ is an isomorphism, this can be
interpreted as saying that $\Pi_1(g)$ and $\Pi_2(g)$ are the “same” map, once we use the
intertwiner $\phi$ to identify $V_1$ and $V_2$. Another way to interpret this is to choose bases
for $V_1$ and $V_2$ and then write (5.44) as

\[ [\Pi_2(g)][\phi] = [\phi][\Pi_1(g)] \quad \forall\, g \in G \]

or

\[ [\Pi_2(g)] = [\phi][\Pi_1(g)][\phi]^{-1} \quad \forall\, g \in G \]

which says that the matrices $[\Pi_2(g)]$ and $[\Pi_1(g)]$ are related by a similarity
transformation. You will use this below to show that our definition of equivalence
of representations is equivalent to the statement that there exist bases for $V_1$ and
$V_2$ such that $[\Pi_1(g)] = [\Pi_2(g)]$ $\forall\, g \in G$ (or the analogous statement for Lie
algebras).
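
The matrix form of (5.44) can be illustrated by manufacturing an equivalent representation from a known one. In the sketch below (NumPy assumed) the group is the one-parameter group of rotations about the z-axis, $\Pi_1$ is its defining action on $\mathbb{R}^3$, and $\Pi_2$ is obtained by conjugating with an arbitrary invertible matrix `phi`; all names and the choice of `phi` are illustrative assumptions.

```python
import numpy as np

def rot_z(theta):
    """Rotation about the z-axis by the angle theta."""
    c, s = np.cos(theta), np.sin(theta)
    return np.array([[c, -s, 0.0], [s, c, 0.0], [0.0, 0.0, 1.0]])

phi = np.array([[1.0, 2.0, 0.0], [0.0, 1.0, 1.0], [1.0, 0.0, 1.0]])   # det = 3
phi_inv = np.linalg.inv(phi)

def Pi1(theta):
    """The original representation: [Pi_1(g)]."""
    return rot_z(theta)

def Pi2(theta):
    """An equivalent representation: [Pi_2(g)] = [phi][Pi_1(g)][phi]^{-1}."""
    return phi @ rot_z(theta) @ phi_inv

a, b = 0.4, 1.3
intertwines = np.allclose(Pi2(a) @ phi, phi @ Pi1(a))          # matrix form of (5.44)
still_a_rep = np.allclose(Pi2(a) @ Pi2(b), Pi2(a + b))         # Pi_2 is a homomorphism
looks_different = not np.allclose(Pi2(a), Pi1(a))              # yet the matrices differ
```

Reading the columns of `phi` as a new basis is the idea behind Exercise 5.19: in suitably chosen bases the two sets of matrices coincide.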


Exercise 5.19. Show that two representations $(\Pi_1, V_1)$ and $(\Pi_2, V_2)$ are equivalent if and
only if there exist bases $\mathcal{B}_1 \subset V_1$ and $\mathcal{B}_2 \subset V_2$ such that

\[ [\Pi_1(g)]_{\mathcal{B}_1} = [\Pi_2(g)]_{\mathcal{B}_2} \quad \forall\, g \in G. \tag{5.45} \]


Exercise 5.20. Let $(\Pi_i, V_i)$, i = 1, 2 be two equivalent representations of a group G, and
let $H \subset G$ be a subgroup. Prove that restricting $\Pi_i : G \to GL(V_i)$ to maps $\Pi_i : H \to GL(V_i)$
yields representations of H, and that these representations of H are also equivalent.
Thus, for example, equivalent representations of O(n) yield equivalent representations of
SO(n), as one would expect.


Before we get to some examples, there are some immediate questions that arise.
For instance, do equivalent representations of a matrix Lie group G give rise to
equivalent representations of $\mathfrak{g}$, and conversely, do equivalent representations of
$\mathfrak{g}$ come from equivalent representations of G? As to the first question, we would
expect heuristically that since $\mathfrak{g}$ consists of “infinitesimal” group elements,
equivalent group representations should yield equivalent Lie algebra representations. This
is in fact the case:


Proposition 5.2. Let G be a matrix Lie group, and let $(\Pi_i, V_i)$, i = 1, 2 be
two equivalent representations of G with intertwining map $\phi$. Then $\phi$ is also an
intertwiner for the induced Lie algebra representations $(\pi_i, V_i)$, and so the induced
Lie algebra representations are equivalent as well.


Proof. We proceed by direct calculation. Since $\phi$ is an intertwiner between $(\Pi_1, V_1)$
and $(\Pi_2, V_2)$ we have

\[ \Pi_2(e^{tX})(\phi(v)) = \phi(\Pi_1(e^{tX})v) \quad \forall\, v \in V_1,\ X \in \mathfrak{g},\ t \in \mathbb{R}. \]

Taking the time derivative of the above equation, evaluating at t = 0, and using the
definition of the induced Lie algebra representations $\pi_i$, as well as the fact that $\phi$
commutes with derivatives (cf. Exercise 4.37), we obtain

\[ \pi_2(X)(\phi(v)) = \phi(\pi_1(X)v) \quad \forall\, v \in V_1,\ X \in \mathfrak{g} \]

and so $\pi_1$ and $\pi_2$ are equivalent. (You should explicitly confirm this as an exercise
if more detail is needed.) □


The second question, of whether or not equivalent representations of $\mathfrak{g}$ come
from equivalent representations of G, is a bit trickier. After all, we noted in the
introduction to this chapter (and will see very concretely in the case of $\mathfrak{so}(3)$) that not
every representation of $\mathfrak{g}$ necessarily comes from a representation of G. However,
if we know that two equivalent Lie algebra representations $(\pi_i, V_i)$, i = 1, 2
actually do come from two group representations $(\Pi_i, V_i)$, and we know the group
is connected, then it is true that $\Pi_1$ and $\Pi_2$ are equivalent.


Proposition 5.3. Let G be a connected Lie group and let $(\Pi_i, V_i)$, i = 1, 2 be
two representations of G, with associated Lie algebra representations $(\pi_i, V_i)$.
If the Lie algebra representations $(\pi_i, V_i)$ are equivalent, then so are the group
representations $(\Pi_i, V_i)$ from which they came.


Proof. The argument relies on the following fact, which we will not prove (see Hall
[11] for details): if G is a connected matrix Lie group, then any $g \in G$ can be
written as a product of exponentials. That is, for any $g \in G$ there exist $X_i \in \mathfrak{g}$,
$t_i \in \mathbb{R}$, i = 1, ..., n such that

\[ g = e^{t_1 X_1} e^{t_2 X_2} \cdots e^{t_n X_n}. \tag{5.46} \]

(In fact, for all the connected matrix Lie groups we’ve met besides SL(2, C), every
group element can be written as a single exponential. For SL(2, C), the polar
decomposition theorem [see Problem 4-6] guarantees that any group element can
be written as a product of two exponentials.) With this fact in hand we can show
that $\Pi_1$ and $\Pi_2$ are equivalent. Let $\phi$ be an intertwining map between $(\pi_1, V_1)$ and
$(\pi_2, V_2)$. Then for any $g \in G$ and $v \in V_1$ we have

\[
\begin{aligned}
\Pi_2(g)(\phi(v)) &= \Pi_2(e^{t_1 X_1} e^{t_2 X_2} \cdots e^{t_n X_n})(\phi(v)) \\
&= \Pi_2(e^{t_1 X_1}) \Pi_2(e^{t_2 X_2}) \cdots \Pi_2(e^{t_n X_n})(\phi(v)) && \text{since } \Pi_2 \text{ is a homomorphism} \\
&= e^{t_1 \pi_2(X_1)} e^{t_2 \pi_2(X_2)} \cdots e^{t_n \pi_2(X_n)}(\phi(v)) && \text{by definition of } \pi_2 \\
&= \phi\left( e^{t_1 \pi_1(X_1)} e^{t_2 \pi_1(X_2)} \cdots e^{t_n \pi_1(X_n)} v \right) && \text{by Exercise 5.21 below} \\
&= \phi\left( \Pi_1(e^{t_1 X_1}) \Pi_1(e^{t_2 X_2}) \cdots \Pi_1(e^{t_n X_n}) v \right) && \text{by definition of } \pi_1 \\
&= \phi\left( \Pi_1(e^{t_1 X_1} e^{t_2 X_2} \cdots e^{t_n X_n}) v \right) && \text{since } \Pi_1 \text{ is a homomorphism} \\
&= \phi(\Pi_1(g)v)
\end{aligned}
\]

and so $\Pi_1$ and $\Pi_2$ are equivalent. □


Proposition 5.3 is useful in that it allows us to prove equivalence of group 
representations by examining the associated Lie algebra representations, which are 
(by virtue of linearity) often easier to work with. 


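Exercise 5.21. Let $\pi_1$ and $\pi_2$ be two equivalent representations of a Lie algebra $\mathfrak{g}$ with
intertwining map $\phi$. Prove by expanding the exponential in a power series that

\[ e^{\pi_2(X)} \circ \phi = \phi \circ e^{\pi_1(X)} \quad \forall\, X \in \mathfrak{g}. \]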
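
The statement of Exercise 5.21 can be tested numerically with a truncated power series. The sketch below assumes NumPy; the matrices standing in for $\pi_1(X)$, $\phi$ and $\pi_2(X) = \phi\, \pi_1(X)\, \phi^{-1}$ are arbitrary illustrative choices, and `exp_series` is our own helper that simply sums the first few terms of the exponential series. It is a numerical check, not a substitute for the proof.

```python
import numpy as np

def exp_series(M, terms=30):
    """Truncated power series sum_{k < terms} M^k / k! for the matrix exponential."""
    result = np.zeros_like(M, dtype=float)
    power = np.eye(M.shape[0])            # holds M^k / k!, starting at k = 0
    for k in range(terms):
        result += power
        power = power @ M / (k + 1)
    return result

pi1_X = np.array([[0.0, -1.0, 0.0], [1.0, 0.0, 0.0], [0.0, 0.0, 0.0]])   # stands for pi_1(X)
phi = np.array([[1.0, 2.0, 0.0], [0.0, 1.0, 1.0], [1.0, 0.0, 1.0]])      # invertible, det = 3
pi2_X = phi @ pi1_X @ np.linalg.inv(phi)                                  # stands for pi_2(X)

# The intertwining property holds for each power, hence for the whole series.
termwise_ok = all(
    np.allclose(np.linalg.matrix_power(pi2_X, k) @ phi,
                phi @ np.linalg.matrix_power(pi1_X, k))
    for k in range(6)
)
exponentials_ok = np.allclose(exp_series(pi2_X) @ phi, phi @ exp_series(pi1_X))
```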


Exercise 5.22. We claimed above that when a matrix Lie group G is connected, then any 
group element can be written as a product of exponentials as in (5.46). To see why the 
hypothesis of connectedness is important, consider the disconnected matrix Lie group O(3) 
and find an element of O(3) that cannot be written as a product of exponentials. 


Now it’s time for some examples.


Example 5.19. Equivalence of $\mathfrak{so}(3)$ and $\mathbb{R}^3$ as SO(3) representations


We know that the adjoint representation and the fundamental representation of $\mathfrak{so}(3)$
are equivalent, by Exercise 5.19 and the fact that in the $\mathcal{B} = \{L_i\}_{i=1-3}$ basis we
have $[\mathrm{ad}_{L_i}]_{\mathcal{B}} = L_i$. What, then, is the intertwining map between $\mathbb{R}^3$ and $\mathfrak{so}(3)$?
Simply the map

\[
\phi : \mathfrak{so}(3) \to \mathbb{R}^3, \qquad
\begin{pmatrix} 0 & -z & y \\ z & 0 & -x \\ -y & x & 0 \end{pmatrix} \mapsto (x, y, z)
\]

which is just the map $X \mapsto [X]_{\mathcal{B}}$. To verify that this is an intertwiner we’ll actually
work on the Lie algebra level, and then use Proposition 5.3 to conclude that the
representations are equivalent on the group level. To verify that $\phi$ satisfies (5.43),
one need only prove that the equation holds for an arbitrary basis element of
$\mathfrak{so}(3)$; since the $\pi_i$ are linear maps, we can expand any $X \in \mathfrak{so}(3)$ in terms
of our basis and the calculation will reduce to verifying the equality just for the
basis elements. (This is the advantage of working with Lie algebras; facts about
representations [such as equivalence] are usually much easier to establish directly
for Lie algebras than for the corresponding groups, since we can use linearity.) We
thus calculate, for any $Y \in \mathfrak{so}(3)$,

\[
\begin{aligned}
(\phi \circ \mathrm{ad}_{L_i})(Y) &= \phi(\mathrm{ad}_{L_i} Y) \\
&= [\mathrm{ad}_{L_i} Y]_{\mathcal{B}} \\
&= [\mathrm{ad}_{L_i}]_{\mathcal{B}} [Y]_{\mathcal{B}} \\
&= L_i [Y]_{\mathcal{B}}
\end{aligned}
\]

while

\[ (L_i \circ \phi)(Y) = L_i [Y]_{\mathcal{B}}. \]

Hence $L_i \circ \phi = \phi \circ \mathrm{ad}_{L_i}$, and so $\phi$ is an intertwining map and the fundamental and
adjoint representations of $\mathfrak{so}(3)$ are equivalent.

By Proposition 5.3, we can then conclude that the adjoint and fundamental
representations of SO(3) are equivalent, since SO(3) is connected. What about
the adjoint and fundamental representations of O(3)? You will recall that O(3) is
not connected (and in fact has two separate connected components), so we cannot
conclude that its adjoint and fundamental representations are equivalent. In fact, as
we mentioned before, we know that these representations are not equivalent, since
if $\phi : \mathfrak{so}(3) \to \mathbb{R}^3$ were an invertible intertwining map, we would have for any
$Y \in \mathfrak{so}(3)$,

\[ \phi(\mathrm{Ad}_{-I} Y) = \phi(Y) \qquad \text{since } \mathrm{Ad}_{-I} \text{ is the identity} \]

as well as

\[
\begin{aligned}
\phi(\mathrm{Ad}_{-I} Y) &= (-I)\phi(Y) && \text{since } \phi \text{ is an intertwiner} \\
&= -\phi(Y),
\end{aligned}
\]

which forces $\phi(Y) = 0$ for every Y, a contradiction with the invertibility of $\phi$.
Thus $\phi$ cannot exist, and the fundamental and adjoint representations
of O(3) are inequivalent.
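
The three statements of this example (equivalence for $\mathfrak{so}(3)$, equivalence for SO(3), and failure for O(3)) can all be seen in a few lines of NumPy. The helper names `hat` and `phi` below are ours; `hat(x, y, z)` builds the antisymmetric matrix that $\phi$ sends to (x, y, z), and the adjoint action of a group element R is taken to be $Y \mapsto R Y R^{-1}$.

```python
import numpy as np

def hat(x, y, z):
    """The antisymmetric matrix that phi sends to (x, y, z)."""
    return np.array([[0.0, -z, y], [z, 0.0, -x], [-y, x, 0.0]])

def phi(Y):
    """phi: so(3) -> R^3, reading (x, y, z) off the matrix entries."""
    return np.array([Y[2, 1], Y[0, 2], Y[1, 0]])

L = [hat(1, 0, 0), hat(0, 1, 0), hat(0, 0, 1)]      # the basis L_1, L_2, L_3
Y = hat(0.4, -1.2, 2.5)                             # an arbitrary element of so(3)

# Lie algebra level: phi(ad_{L_i} Y) = L_i phi(Y) for each basis element.
algebra_ok = all(np.allclose(phi(Li @ Y - Y @ Li), Li @ phi(Y)) for Li in L)

# Group level, for a rotation R in SO(3): phi(Ad_R Y) = R phi(Y).
theta = 0.9
R = np.array([[np.cos(theta), -np.sin(theta), 0.0],
              [np.sin(theta),  np.cos(theta), 0.0],
              [0.0, 0.0, 1.0]])
group_ok = np.allclose(phi(R @ Y @ R.T), R @ phi(Y))

# For -I in O(3) the same map fails: Ad_{-I} is the identity, but -I flips vectors.
minus_I = -np.eye(3)
inversion_ok = np.allclose(phi(minus_I @ Y @ minus_I.T), minus_I @ phi(Y))
```

Here `algebra_ok` and `group_ok` are true while `inversion_ok` is false. Of course this only shows that this particular $\phi$ fails for O(3); the argument in the text rules out every candidate.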

We summarize all this in Table 5.5 below, which is in part a subset of Table 5.3.
Here, we emphasize that the equivalences in the first two columns imply each other,
as a result of Propositions 5.2 and 5.3. Furthermore, the inequivalence in the case of
O(3) shows the necessity of the connectedness hypothesis in Proposition 5.3, and
also that the converse of Proposition 5.2 is not true.


Exercise 5.23. Use Example 5.17 and Exercise 5.19 to deduce that $\mathbb{R}$ and $\Lambda^3\mathbb{R}^3$ are
equivalent as SO(3) representations (in fact, they are both the trivial representation). Can
you find an intertwiner? Do the same for the $SO(3, 1)_o$ representations $\mathbb{R}$ and $\Lambda^4\mathbb{R}^4$, using
the results of Example 5.18. Using Proposition 5.3, also conclude that $\mathbb{R}^4$ and $\Lambda^3\mathbb{R}^4$ are
equivalent as $SO(3, 1)_o$ representations.


Table 5.5 Summary of the fundamental and adjoint representations and their
equivalences for $\mathfrak{so}(3)$, SO(3), and O(3)

                              | $\mathfrak{so}(3)$ | SO(3)    | O(3)
$\mathfrak{so}(3)$ (adjoint)  | Vector             | Vector   | Pseudovector
                              | $\simeq$           | $\simeq$ | $\not\simeq$
$\mathbb{R}^3$ (fundamental)  | Vector             | Vector   | Vector


Exercise 5.24. Reread Sect. 3.10 in light of the last few sections. What would we now call
the map J that we introduced in that section?


Example 5.20. Vector spaces with metrics and their duals 


Let V be a vector space equipped with a metric g (recall that a metric is any
symmetric, non-degenerate bilinear form), and let $(\Pi, V)$ be a representation of
G whereby G acts by isometries, i.e. $\Pi(g) \in \mathrm{Isom}(V)$ $\forall\, g \in G$. Examples of this
include the fundamental representation of O(n) on $\mathbb{R}^n$ equipped with the Euclidean
metric, or the fundamental representation of O(n − 1, 1) on $\mathbb{R}^n$ with the Minkowski
metric, but not U(n) or SU(n) acting on $\mathbb{C}^n$ with the standard Hermitian inner
product (why not?). If the assumptions above are satisfied, then $(\Pi, V)$ is equivalent
to the dual representation $(\Pi^*, V^*)$ and the intertwiner is nothing but our old friend

\[
\begin{aligned}
L : V &\to V^* \\
v &\mapsto g(v, \cdot).
\end{aligned}
\]


You will verify in Exercise 5.25 that L is indeed an intertwiner, proving the asserted
equivalence. Thus, in particular we again reproduce (this time in a basis-independent 
way) the familiar fact that dual vectors on n-dimensional Euclidean space transform 
just like ordinary vectors under orthogonal transformations. Additionally, we see 
that dual vectors on n-dimensional Minkowski space transform just like ordinary 
vectors under Lorentz transformations! See Exercise 5.26 for the matrix manifesta- 
tion of this. 

The reason we’ve excluded complex vector spaces with Hermitian inner products
from this example is that in such circumstances, the map $L$ is not linear (why not?)
and thus can’t be an intertwiner. In fact, the fundamental representation of $SU(n)$
on $\mathbb{C}^n$ for $n \geq 3$ is not equivalent to its dual,¹⁰ and the dual representation in
such circumstances can sometimes be interpreted as an antiparticle if the original
representation represents a particle. For instance, a quark can be thought of as a
vector in the fundamental representation $\mathbb{C}^3$ of $SU(3)$ (this is sometimes denoted
as $\mathbf{3}$ in the physics literature), and then the dual representation $\mathbb{C}^{3*}$ corresponds to
the antiquarks (this representation is often denoted as $\bar{\mathbf{3}}$). For $SU(2)$, however, the
fundamental (spinor) representation is equivalent to its dual, though that doesn’t
follow from the discussion above. This equivalence is the subject of the next
example.

Exercise 5.25. Let $V$ be a vector space equipped with a metric $g$ and let $(\Pi, V)$ be a
representation of a group $G$ by isometries. Consider $L : V \to V^*$ as defined above. Prove
that $L \circ \Pi_g = \Pi^*_g \circ L$ for all $g \in G$. Since $(L \circ \Pi_g)(v) \in V^*$ for any $v \in V$, you must
show that

$$(L \circ \Pi_g)(v) = (\Pi^*_g \circ L)(v)$$

as dual vectors on $V$, which means showing that they have the same action on an arbitrary
second vector $w$.


¹⁰ We won’t prove this here; see Hall [11] for details.

Exercise 5.26. Let $(\Pi, \mathbb{R}^4)$ be the fundamental representation of $O(3,1)$ on $\mathbb{R}^4$, and let
$\mathcal{B} = \{e_i\}_{i=1,\dots,4}$ be the standard basis for $\mathbb{R}^4$ and $\mathcal{B}^* = \{e^i\}_{i=1,\dots,4}$ the corresponding dual
basis. Find another basis $\mathcal{B}^{*\prime}$ for $\mathbb{R}^{4*}$ such that

$$[\Pi^*(g)]_{\mathcal{B}^{*\prime}} = [\Pi(g)]_{\mathcal{B}}.$$

The map $L$ might help you here. Does this generalize to the fundamental representation of
$O(n-1,1)$ on $n$-dimensional Minkowski space?


Example 5.21. The fundamental (spinor) representation of SU(2) and its dual 


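Consider the fundamental representation $(\Pi, \mathbb{C}^2)$ of $SU(2)$ and its dual
representation $(\Pi^*, \mathbb{C}^{2*})$. We’ll show that these representations are equivalent, by first
showing that the induced Lie algebra representations are equivalent and then invoking
Proposition 5.3. We’ll show equivalence of the Lie algebra representations by
exhibiting a basis for $\mathbb{C}^{2*}$ that yields the same matrix representations for $(\pi^*, \mathbb{C}^{2*})$
as for $(\pi, \mathbb{C}^2)$. For an intrinsic (coordinate-independent) proof, see Problem 5-3.
As should be familiar by now, the fundamental representation of $\mathfrak{su}(2)$ is just the
identity:

$$\pi(S_x) = S_x = \frac{1}{2}\begin{pmatrix} 0 & -i \\ -i & 0 \end{pmatrix}$$
$$\pi(S_y) = S_y = \frac{1}{2}\begin{pmatrix} 0 & -1 \\ 1 & 0 \end{pmatrix}$$
$$\pi(S_z) = S_z = \frac{1}{2}\begin{pmatrix} -i & 0 \\ 0 & i \end{pmatrix}.$$

This, combined with (5.27), tells us that in the standard dual basis $\mathcal{B}^*$,

$$[\pi^*(S_x)]_{\mathcal{B}^*} = \frac{1}{2}\begin{pmatrix} 0 & i \\ i & 0 \end{pmatrix}$$
$$[\pi^*(S_y)]_{\mathcal{B}^*} = \frac{1}{2}\begin{pmatrix} 0 & -1 \\ 1 & 0 \end{pmatrix}$$
$$[\pi^*(S_z)]_{\mathcal{B}^*} = \frac{1}{2}\begin{pmatrix} i & 0 \\ 0 & -i \end{pmatrix}.$$

Now define a new basis $\mathcal{B}^{*\prime} = \{e^{1'}, e^{2'}\}$ for $\mathbb{C}^{2*}$ by

$$e^{1'} \equiv -e^2, \qquad e^{2'} \equiv e^1.$$

You can check that the corresponding change of basis matrix $A$ is

$$A = \begin{pmatrix} 0 & -1 \\ 1 & 0 \end{pmatrix}. \tag{5.47}$$

You can also check that in this new basis, the operators $\pi^*(S_i)$ are given by

$$\begin{aligned}
[\pi^*(S_x)]_{\mathcal{B}^{*\prime}} &= A\,[\pi^*(S_x)]_{\mathcal{B}^*}A^{-1} = S_x \\
[\pi^*(S_y)]_{\mathcal{B}^{*\prime}} &= A\,[\pi^*(S_y)]_{\mathcal{B}^*}A^{-1} = S_y \\
[\pi^*(S_z)]_{\mathcal{B}^{*\prime}} &= A\,[\pi^*(S_z)]_{\mathcal{B}^*}A^{-1} = S_z
\end{aligned} \tag{5.48}$$

and thus $(\pi^*, \mathbb{C}^{2*})$ is equivalent to $(\pi, \mathbb{C}^2)$. The connectedness of $SU(2)$ and
Proposition 5.3 then imply that $(\Pi^*, \mathbb{C}^{2*})$ is equivalent to $(\Pi, \mathbb{C}^2)$, as claimed.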
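
The similarity relations in (5.48) are easy to check numerically. The following is a minimal sketch, not part of the original argument: it assumes that (5.27) states that the dual representation acts in the dual basis by minus the transpose, and the helper names (`matmul`, `neg_transpose`, `dual_in_new_basis`) are ours.

```python
# Numerical check of (5.48), using 2x2 nested lists of Python complex numbers.

def matmul(P, Q):
    return [[sum(P[i][k] * Q[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

def neg_transpose(M):
    return [[-M[j][i] for j in range(2)] for i in range(2)]

def close(P, Q, tol=1e-12):
    return all(abs(P[i][j] - Q[i][j]) < tol for i in range(2) for j in range(2))

# Fundamental representation of su(2), in the convention used in the text.
S = {
    "x": [[0, -0.5j], [-0.5j, 0]],
    "y": [[0, -0.5], [0.5, 0]],
    "z": [[-0.5j, 0], [0, 0.5j]],
}

A = [[0, -1], [1, 0]]       # change of basis matrix (5.47)
A_inv = [[0, 1], [-1, 0]]   # its inverse

def dual_in_new_basis(M):
    # Assumption: in the dual basis, pi*(X) is minus the transpose of pi(X).
    return matmul(matmul(A, neg_transpose(M)), A_inv)

results = {name: close(dual_in_new_basis(M), M) for name, M in S.items()}
print(results)
```

Each entry of `results` records whether $A[\pi^*(S_i)]_{\mathcal{B}^*}A^{-1}$ agrees with $S_i$; this is a sanity check only and does not replace the verification asked for in Exercise 5.27.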


Exercise 5.27. Verify (5.47) and (5.48). 


Exercise 5.28. Extend the above argument to show that the fundamental representation of 
SL(2, C) is equivalent to its dual. 


Example 5.22. Antisymmetric 2nd rank tensors and the adjoint representation of 
orthogonal and Lorentz groups 


We claimed in Sect. 4.1 that for the Lorentz group, the adjoint representation is
“the same” as the antisymmetric 2nd rank tensor representation (the latter being
the representation to which the electromagnetic field tensor $F^{\mu\nu}$ belongs). We also
argued in Example 5.16 that the same is true for the orthogonal group $O(n)$. Now it
is time to prove these claims rigorously. Consider $\mathbb{R}^n$ equipped with
a metric $g$, where $g$ is either the Euclidean metric or the Minkowski metric. The
isometry group $G = \mathrm{Isom}(\mathbb{R}^n)$ is then either $O(n)$ or $O(n-1,1)$, and $\mathfrak{g}$ is then either
$\mathfrak{so}(n)$ or $\mathfrak{so}(n-1,1)$. To prove equivalence, we need an intertwiner $\phi : \Lambda^2\mathbb{R}^n \to \mathfrak{g}$.
Here we’ll define $\phi$ abstractly and use coordinate-free language to prove that it’s an
intertwiner. The proof is a little long, requires some patience, and may be skipped
on a first reading, but it is a good exercise for getting acquainted with the machinery
of this chapter; for an illuminating coordinate proof, see Problem 5-2.

To define $\phi$ abstractly we interpret $\mathfrak{g}$ not as a space of matrices but rather as linear
operators on $\mathbb{R}^n$. Since $\Lambda^2\mathbb{R}^n$ is just the set of antisymmetric (0, 2) tensors on $\mathbb{R}^n$,
we can then define $\phi$ by just using the map $L$ to “lower an index” on an element
$T \in \Lambda^2\mathbb{R}^n$, converting the (0, 2) tensor into a (1, 1) tensor, i.e. a linear operator.
More precisely, we define the linear operator $\phi(T)$ as

$$\phi(T)(v, f) \equiv T(L(v), f), \qquad v \in \mathbb{R}^n,\ f \in \mathbb{R}^{n*}.$$


We must check, though, that $\phi(T) \in \mathfrak{g}$. This will be facilitated by characterizing
$X \in \mathfrak{g}$ by (4.55), which in this context we write as

$$g(Xv, w) + g(v, Xw) = 0. \tag{5.49}$$

Thus, to show that $\phi$ really maps into $\mathfrak{g}$, we just need to show that $\phi(T)$ as
defined above satisfies (5.49):

$$\begin{aligned}
g(\phi(T)v, w) + g(v, \phi(T)w)
&= \phi(T)(v, L(w)) + \phi(T)(w, L(v)) && \text{by definition of } L \\
&= T(L(v), L(w)) + T(L(w), L(v)) && \text{by definition of } \phi(T) \\
&= 0 && \text{by antisymmetry of } T.
\end{aligned}$$

Thus $\phi(T)$ really is in $\mathfrak{g}$. To show that $\phi$ is an intertwiner, we need to show that
$\phi \circ \Lambda^2\Pi(R) = \mathrm{Ad}(R) \circ \phi$ for all $R \in G$. To do this, we’ll employ the alternate
definition of the tensor product representation given in Exercise 5.13, which in this
case says

$$(\Lambda^2\Pi(R)T)(f, h) = T(\Pi^*(R^{-1})f,\ \Pi^*(R^{-1})h), \qquad R \in G,\ f, h \in \mathbb{R}^{n*}. \tag{5.50}$$

We then have, for all $v \in \mathbb{R}^n$, $f \in \mathbb{R}^{n*}$ (careful with all the parentheses!),

$$\begin{aligned}
(\phi \circ \Lambda^2\Pi(R)(T))(v, f) &= (\Lambda^2\Pi(R)(T))(L(v), f) \\
&= T(\Pi^*(R^{-1})L(v),\ \Pi^*(R^{-1})f) \\
&= T(L(R^{-1}v),\ \Pi^*(R^{-1})f),
\end{aligned}$$

where in the last equality we used the fact that $L$ is an intertwiner between $\mathbb{R}^n$ and
$\mathbb{R}^{n*}$. Again employing the alternate definition of the tensor product representation,
but this time for $\Pi^1_1$, we also have

$$\begin{aligned}
((\mathrm{Ad}(R) \circ \phi)(T))(v, f) &= (\phi(T))(R^{-1}v,\ \Pi^*(R^{-1})f) \\
&= T(L(R^{-1}v),\ \Pi^*(R^{-1})f)
\end{aligned}$$

and we can thus conclude that $\phi \circ (\Lambda^2\Pi(R)) = \mathrm{Ad}(R) \circ \phi$, as desired.

So what does all this tell us? The conclusion that the tensor product
representation of $G$ on antisymmetric 2nd rank tensors coincides with the adjoint
representation of $G$ on $\mathfrak{g}$ is not at all surprising in the Euclidean case, because
there $\mathfrak{g}$ is just $\mathfrak{so}(n)$, the set of all antisymmetric matrices! In the Lorentzian case,
however, we might be a little surprised, since the matrices in $\mathfrak{so}(n-1,1)$ are not all
antisymmetric. These matrices, however, represent linear operators, and if we use $L$
to convert them into (0, 2) tensors (via $\phi^{-1}$), then they are antisymmetric! In other
words,


The Lie algebra of a Lie group of metric-preserving operators can always 
be viewed as antisymmetric tensors, 


though we may have to raise or lower an index (via $L$) to make this manifest.

This is actually quite easy to show in coordinates: using the standard basis for $\mathbb{R}^n$
and the corresponding components of $X = X_i{}^j\, e^i \otimes e_j \in \mathfrak{g}$ (still viewed as a linear
operator on $\mathbb{R}^n$), we can plug two basis vectors into (5.49) and obtain

$$\begin{aligned}
0 &= g(Xe_i, e_j) + g(e_i, Xe_j) \\
  &= X_i{}^k g(e_k, e_j) + X_j{}^k g(e_i, e_k) \\
  &= X_i{}^k g_{kj} + X_j{}^k g_{ik} \\
  &= X_{ij} + X_{ji}
\end{aligned}$$

and so the (2, 0) tensor corresponding to $X \in \mathfrak{g}$ is antisymmetric! □
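
As a concrete illustration of this computation, the sketch below lowers an index on a boost generator and a rotation generator of $\mathfrak{so}(3,1)$ and tests the result for antisymmetry. It assumes the Minkowski metric $\mathrm{diag}(1,1,1,-1)$ (timelike direction last) and the component convention $Xe_i = X_i{}^k e_k$, so that column $i$ of the matrix holds the components of $Xe_i$; the function names are ours.

```python
# Lowering an index on elements of so(3,1): X_ij = X_i^k g_kj.

def lower_index(M, g):
    """Return the array X_ij = sum_k X_i^k g_kj, where M[k][i] = X_i^k."""
    n = len(M)
    return [[sum(M[k][i] * g[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def is_antisymmetric(T):
    n = len(T)
    return all(T[i][j] == -T[j][i] for i in range(n) for j in range(n))

eta = [[1, 0, 0, 0],
       [0, 1, 0, 0],
       [0, 0, 1, 0],
       [0, 0, 0, -1]]      # assumed signature: time coordinate last

boost_x = [[0, 0, 0, 1],
           [0, 0, 0, 0],
           [0, 0, 0, 0],
           [1, 0, 0, 0]]   # generator of boosts along x: a symmetric matrix

rot_z = [[0, -1, 0, 0],
         [1, 0, 0, 0],
         [0, 0, 0, 0],
         [0, 0, 0, 0]]     # generator of rotations about z

print(is_antisymmetric(boost_x))                    # the matrix itself
print(is_antisymmetric(lower_index(boost_x, eta)))  # after lowering an index
print(is_antisymmetric(lower_index(rot_z, eta)))
```

The boost generator is a symmetric matrix, yet the array $X_{ij}$ obtained by lowering an index is antisymmetric, which is the point made above for the Lorentzian case.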


Our last example takes the form of an exercise, in which you will show that
some of the representations of $SU(2)$ on $P_l(\mathbb{C}^2)$, as described in Example 5.7, are
equivalent to more basic $SU(2)$ representations.

Exercise 5.29. Prove (by exhibiting an intertwining map) that $(\pi_1, P_1(\mathbb{C}^2))$ from
Example 5.7 is equivalent to the fundamental representation of $\mathfrak{su}(2)$. Conclude (since $SU(2)$ is
connected) that $(\Pi_1, P_1(\mathbb{C}^2))$ is equivalent to the fundamental representation of $SU(2)$. Do
the same for $(\pi_2, P_2(\mathbb{C}^2))$ and the adjoint representation of $\mathfrak{su}(2)$ (you will need to consider
a new basis that consists of complex linear combinations of the elements of $\mathcal{B}_2$).


5.7 Direct Sums and Irreducibility 


One of our goals in this chapter is to organize the various representations we’ve met 
into a coherent scheme, and to see how they are all related. Defining a notion of 
equivalence was the first step, so that we would know when two representations 
are “the same.” With that in place, we would now like to determine all the 
(inequivalent) representations of a given group or Lie algebra. In general this is 
a difficult problem, but for most of the matrix Lie groups we’ve met so far and 
their associated Lie algebras, there is a very nice way to do this: for each group 
or algebra, there exists a denumerable set of inequivalent representations (known 
as the “irreducible” representations) out of which all other representations can be 
built. Once these irreducible representations are known, any other representation 
can be broken down into a kind of “sum” of its irreducible components. In 
this section we’ll present the notions of irreducibility and sum of vector spaces, 
and in subsequent sections we’ll enumerate all the irreducible representations of 
SU(2), SO(3), SO(3, 1), SL(2,C), and their associated Lie algebras. 

To motivate the discussion, consider the vector space $M_n(\mathbb{R})$ of all $n \times n$
real matrices. There is a representation $\Pi$ of $O(n)$ on this vector space given by
similarity transformations:

$$\Pi(R)A \equiv RAR^{-1}, \qquad R \in O(n),\ A \in M_n(\mathbb{R}).$$

If we consider $M_n(\mathbb{R})$ to be the matrices corresponding to elements of $\mathcal{L}(\mathbb{R}^n)$, then
this is just the matrix version of the linear operator representation $\Pi^1_1$ described
above, with $V = \mathbb{R}^n$. Alternatively, it can be viewed as the matrix version of the
representation $\Pi^0_2$ on $\mathbb{R}^n \otimes \mathbb{R}^n$, with identical matrix transformation law

$$[\Pi^0_2(R)(T)] = R[T]R^T = R[T]R^{-1}, \qquad T \in \mathbb{R}^n \otimes \mathbb{R}^n.$$


Now, it turns out that there are some special properties that $A \in M_n(\mathbb{R})$ could have
that would be preserved by $\Pi(R)$. For instance, if $A$ is symmetric or antisymmetric,
then $RAR^{-1}$ is also (you can check this directly, or see it as a corollary of
the discussion at the beginning of Sect. 5.5). Furthermore, if $A$ has zero trace,
then so does $RAR^{-1}$. In fact, we can decompose $A$ into a symmetric piece and
an antisymmetric piece, and then further decompose the symmetric piece into a
traceless piece and a piece proportional to the identity, as follows:

$$\begin{aligned}
A &= \frac{1}{2}(A + A^T) + \frac{1}{2}(A - A^T) \\
  &= \frac{1}{n}(\mathrm{Tr}\,A)\,I + \frac{1}{2}\left(A + A^T - \frac{2}{n}(\mathrm{Tr}\,A)\,I\right) + \frac{1}{2}(A - A^T).
\end{aligned} \tag{5.51}$$


(You should check explicitly that the first term in (5.51) is proportional to the 
identity, the second is symmetric and traceless, and the third is antisymmetric.) 
Furthermore, this decomposition is unique, as you will show below. If we recall 
the definitions 


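$$\begin{aligned}
S_n(\mathbb{R}) &= \{M \in M_n(\mathbb{R}) \mid M = M^T\} \\
A_n(\mathbb{R}) &= \{M \in M_n(\mathbb{R}) \mid M = -M^T\},
\end{aligned}$$

and add the new definitions

$$\begin{aligned}
S'_n(\mathbb{R}) &\equiv \{M \in M_n(\mathbb{R}) \mid M = M^T,\ \mathrm{Tr}\,M = 0\} \\
\mathbb{R}I &\equiv \{M \in M_n(\mathbb{R}) \mid M = cI,\ c \in \mathbb{R}\},
\end{aligned}$$

then this means that any $A \in M_n(\mathbb{R})$ can be written uniquely as a sum of elements
of $S'_n(\mathbb{R})$, $A_n(\mathbb{R})$, and $\mathbb{R}I$, all of which are subspaces of $M_n(\mathbb{R})$. This type of situation
turns up frequently, so we formalize it with the following definition: If $V$ is a vector
space with subspaces $W_1, W_2, \ldots, W_k$ such that any $v \in V$ can be written uniquely
as $v = w_1 + w_2 + \cdots + w_k$, where $w_i \in W_i$, then we say that $V$ is the direct sum of
the $W_i$ and we write

$$V = W_1 \oplus W_2 \oplus \cdots \oplus W_k \qquad \text{or} \qquad V = \bigoplus_{i=1}^{k} W_i.$$

The decomposition of a vector $v$ is sometimes written as $v = (w_1, \ldots, w_k)$. In the
situation above we have, as you can check,

$$M_n(\mathbb{R}) = S_n(\mathbb{R}) \oplus A_n(\mathbb{R}) = \mathbb{R}I \oplus S'_n(\mathbb{R}) \oplus A_n(\mathbb{R}). \tag{5.52}$$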
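
The decomposition (5.51) and its compatibility with the $O(n)$ action can be tried out on a small example. The following sketch is ours, not the text's: it uses exact rational arithmetic, a sample $3 \times 3$ integer matrix, and a rotation by 90 degrees about the z-axis as the group element.

```python
from fractions import Fraction

def transpose(M):
    return [list(row) for row in zip(*M)]

def matmul(P, Q):
    return [[sum(P[i][k] * Q[k][j] for k in range(len(Q))) for j in range(len(Q[0]))]
            for i in range(len(P))]

def add(*Ms):
    n = len(Ms[0])
    return [[sum(M[i][j] for M in Ms) for j in range(n)] for i in range(n)]

def decompose(A):
    """Split A as in (5.51): (multiple of identity, traceless symmetric, antisymmetric)."""
    n = len(A)
    t = Fraction(sum(A[i][i] for i in range(n)), n)          # (1/n) Tr A
    scalar = [[t if i == j else Fraction(0) for j in range(n)] for i in range(n)]
    sym0 = [[Fraction(A[i][j] + A[j][i], 2) - scalar[i][j] for j in range(n)]
            for i in range(n)]
    anti = [[Fraction(A[i][j] - A[j][i], 2) for j in range(n)] for i in range(n)]
    return scalar, sym0, anti

def conjugate(R, M):
    """Pi(R) M = R M R^{-1}, using R^{-1} = R^T for orthogonal R."""
    return matmul(matmul(R, M), transpose(R))

A = [[1, 2, 3], [4, 5, 6], [7, 8, 10]]       # arbitrary sample matrix
R = [[0, -1, 0], [1, 0, 0], [0, 0, 1]]       # rotation by 90 degrees about z

pieces = decompose(A)
pieces_of_rotated = decompose(conjugate(R, A))
rotated_pieces = tuple(conjugate(R, P) for P in pieces)

print(add(*pieces) == A)                      # the three pieces sum back to A
print(pieces_of_rotated == rotated_pieces)    # decomposing commutes with Pi(R)
```

The second comparison says that each of the three pieces is carried into a piece of the same type by $\Pi(R)$, which is the invariance property that makes the three summands in (5.52) worth singling out.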


Exercise 5.30. Show that if $V = \bigoplus_{i=1}^{k} W_i$, then $W_i \cap W_j = \{0\}\ \forall\, i \neq j$, i.e. the
intersection of two different $W_i$ is just the zero vector. Verify this explicitly in the case
of the two decompositions in (5.52).

Exercise 5.31. Show that $V = \bigoplus_{i=1}^{k} W_i$ is equivalent to the statement that the set

$$\mathcal{B} = \mathcal{B}_1 \cup \cdots \cup \mathcal{B}_k,$$

where each $\mathcal{B}_i$ is an arbitrary basis for $W_i$, is a basis for $V$.


Exercise 5.32. Show that $M_n(\mathbb{C}) = H_n(\mathbb{C}) \oplus \mathfrak{u}(n)$.


Our discussion above shows that all the subspaces that appear in (5.52) are
invariant, in the sense that the action of $\Pi(R)$ on any element of one of the
subspaces produces another element of the same subspace (i.e., if $A$ is symmetric
then $\Pi(R)A$ is also, etc.). In fact, we can define an invariant subspace of a group
representation $(\Pi, V)$ as a subspace $W \subset V$ such that $\Pi(g)w \in W$ for all
$g \in G$, $w \in W$. Invariant subspaces of Lie algebra representations are defined
analogously. Notice that the entire vector space $V$, as well as the zero subspace $\{0\}$,
are always (trivially) invariant subspaces. An invariant subspace $W \subset V$ that is
neither equal to $V$ nor to $\{0\}$ is said to be a nontrivial invariant subspace. Notice
also that the invariance of $W$ under the $\Pi(g)$ means we can restrict each $\Pi(g)$ to
$W$ (that is, interpret each $\Pi(g)$ as an operator $\Pi(g)|_W \in GL(W)$) and so obtain a
representation $(\Pi|_W, W)$ of $G$ on $W$. Given a representation $V$, the symmetric and
antisymmetric subspaces $S^r(V)$ and $\Lambda^r(V)$ of the tensor product representation
$T^0_r(V)$ are nice examples of nontrivial invariant subspaces (recall, of course, that
$\Lambda^r(V)$ is only nontrivial when $r \leq \dim V$).

Nontrivial invariant subspaces are very important in representation theory, as they
allow us to block diagonalize the matrices corresponding to our operators. If we
have a representation $(\Pi, V)$ of a group $G$ where $V$ decomposes into a direct sum
of invariant subspaces $W_i$,

$$V = W_1 \oplus W_2 \oplus \cdots \oplus W_k,$$

and if $\mathcal{B}_i$ are bases for $W_i$ and we take the union of the $\mathcal{B}_i$ as a basis $\mathcal{B}$ for $V$, then you
should check that the matrix representation of the operator $\Pi(g)$ in this basis will
look like

$$[\Pi(g)]_{\mathcal{B}} = \begin{pmatrix}
[\Pi(g)]_{\mathcal{B}_1} & & & \\
& [\Pi(g)]_{\mathcal{B}_2} & & \\
& & \ddots & \\
& & & [\Pi(g)]_{\mathcal{B}_k}
\end{pmatrix} \tag{5.53}$$

where each $[\Pi(g)]_{\mathcal{B}_i}$ is the matrix of $\Pi(g)$ restricted to the subspace $W_i$.

Thus, given a finite-dimensional representation $(\Pi, V)$ of a group or Lie algebra,
we can try to get a handle on it by decomposing $V$ into a direct sum of (two or
more) invariant subspaces. Each of these invariant subspaces forms a representation
in its own right, and we could then try to further decompose these representations,
iterating until we get a decomposition $V = W_1 \oplus \cdots \oplus W_k$ in which each of the $W_i$
has no nontrivial invariant subspaces (if they did, we might be able to decompose
further). These elementary representations play an important role in the theory, as
we’ll see, and so we give them a name: we say that a representation $W$ that has no
nontrivial invariant subspaces is an irreducible representation (or irrep, for short).
Furthermore, if a representation $(\Pi, V)$ admits a decomposition $V = W_1 \oplus \cdots \oplus W_k$
where each of the $W_i$ is irreducible, then we say that $(\Pi, V)$ is completely
reducible.¹¹ If the decomposition consists of two or more summands, then we say
that $V$ is decomposable.¹² Thus we have the following funny-sounding sentence: if
$(\Pi, V)$ is completely reducible, then it is either decomposable or irreducible!

You may be wondering at this point how a finite-dimensional representation
could not be completely reducible; after all, it is either irreducible or it contains
a nontrivial invariant subspace $W$; can’t we then decompose $V$ into $W$ and some
subspace $W'$ complementary to $W$? We can, but the potential problem is that $W'$
may not be invariant; that is, the group or Lie algebra action might take vectors in $W'$
to vectors that don’t lie in $W'$. For an example of this, see Problem 5-8. However,
there do exist many groups and Lie algebras for which every finite-dimensional
representation is completely reducible. Such groups and Lie algebras are said to be
semi-simple.¹³ (Most of the matrix Lie groups we’ve met and their associated Lie
algebras are semi-simple, but some of the abstract Lie algebras we’ve seen [like
the Heisenberg algebra], as well as the matrix Lie group $U(n)$, are not.) Thus, an
arbitrary finite-dimensional representation of a semi-simple group or Lie algebra can
always be written as a direct sum of irreducible representations. (This is what we did
when we wrote $M_n(\mathbb{R})$ as $M_n(\mathbb{R}) = \mathbb{R}I \oplus S'_n(\mathbb{R}) \oplus A_n(\mathbb{R})$, though we can’t yet prove
that the summands are irreducible.) If we know all the irreducible representations of
a given semi-simple group or Lie algebra, we then have a complete classification
of all the finite-dimensional representations of that group or algebra, since any

representation decomposes into a finite sum of irreps. This makes the determination
of irreps an important task, which we’ll complete in this chapter for our favorite Lie
algebras.


¹¹ Note that an irreducible representation is, trivially, completely reducible, since $V = V$ is a
decomposition into irreducibles. Thus “irreducible” and “completely reducible” are not mutually
exclusive categories, even if they may sound like it!

¹² This terminology is not standard but will prove useful.

¹³ Semi-simplicity can be defined in a number of equivalent ways, all of which are important. For
more, see Hall [11] or Varadarajan [20].

Before moving on to this task, however, there is more to say about decomposable
and irreducible representations. We’ll begin with examples of decomposable
representations, which can arise in a number of ways. One of the most common is
by taking tensor products. Say we have a semi-simple group or Lie algebra and two
irreps $V_1$ and $V_2$. The tensor product representation $V_1 \otimes V_2$ is usually not irreducible,
but since our group or Lie algebra is semi-simple we can decompose $V_1 \otimes V_2$ as

$$V_1 \otimes V_2 = W_1 \oplus \cdots \oplus W_k,$$

where the $W_i$ are irreducible. In fact, the last decomposition in (5.52) is just the
matrix version of the $O(n)$-invariant decomposition

$$T^0_2(\mathbb{R}^n) = \mathbb{R}^n \otimes \mathbb{R}^n = \mathbb{R}g \oplus S^{2\prime}(\mathbb{R}^n) \oplus \Lambda^2(\mathbb{R}^n), \tag{5.54}$$

where

$$g = \sum_i e_i \otimes e_i \tag{5.55a}$$

$$S^{2\prime}(\mathbb{R}^n) = \{T \in S^2(\mathbb{R}^n) \mid \delta_{ij}T^{ij} = 0\}, \text{ where } \delta_{ij} \text{ is the Kronecker delta.} \tag{5.55b}$$


Another more general instance of the decomposition of a tensor product representa- 
tion into irreducibles, of great interest to physicists, is given by the next example. 


Example 5.23. Decomposition of the tensor product of SU(2) representations, or 
“addition of angular momentum” 


Consider the spin $j$ representation $\mathbb{C}^{2j+1} = V_j$ and spin $j'$ representation $\mathbb{C}^{2j'+1} = V_{j'}$.
Taking their tensor product yields a decomposable representation, which (as
mentioned in Example 3.21) decomposes as

$$V_j \otimes V_{j'} = V_{j+j'} \oplus V_{j+j'-1} \oplus \cdots \oplus V_{|j-j'|+1} \oplus V_{|j-j'|}. \tag{5.56}$$

It’s not hard to see why this might be true. Intuitively, adding two angular
momentum vectors of length $j$ and $j'$ can only yield vectors with lengths between
$j + j'$ and $|j - j'|$. We can also sketch a more formal argument as follows. Using
the notation of Example 3.21, the highest $m$ value in $V_j \otimes V_{j'}$ must be $j + j'$
with corresponding eigenvector $|j\rangle \otimes |j'\rangle$, and so $V_{j+j'}$ must be a summand
in the decomposition. The next highest $m$ value is $j + j' - 1$, but now there
are two possible eigenvectors, $|j\rangle \otimes |j'-1\rangle$ and $|j-1\rangle \otimes |j'\rangle$. These span a
two-dimensional space, one dimension of which must belong to $V_{j+j'}$ and the
other to $V_{j+j'-1}$, so the latter must also be a summand. Considering the next
highest $m$ value $j + j' - 2$ yields a three-dimensional subspace spanned by
$\{|j-2\rangle \otimes |j'\rangle,\ |j-1\rangle \otimes |j'-1\rangle,\ |j\rangle \otimes |j'-2\rangle\}$, implying that $V_{j+j'-2}$ must
also be a summand. One can continue in this way until $m = |j - j'|$, at which point
the $m$ eigenspaces stop growing in dimension. At this point, however, a simple
dimension count (see Exercise 5.33 below) confirms that all vectors are accounted
for and hence that (5.56) is true.

More detailed proofs of (5.56) can be found in Appendix B of Sakurai [17] and 
Appendix D of Hall [11]. The further process of explicitly decomposing the various 
m eigenspaces into vectors belonging to the different summands in (5.56) is known 
in the physics literature as the “addition of angular momentum,” and is tantamount 
to computing Clebsch-Gordan coefficients. See Sakurai [17] for details. □
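
The counting argument sketched in this example can be tested for particular spins. The code below is an illustrative sketch with names of our own choosing: it lists the spins appearing on the right-hand side of (5.56), compares total dimensions, and compares the dimension of each $m$ eigenspace computed from the product basis with the number of summands that contain that $m$ value. A check for specific $j, j'$ is of course not a proof.

```python
from fractions import Fraction

def dim(j):
    """Dimension 2j + 1 of the spin j representation."""
    return int(2 * Fraction(j) + 1)

def spins_in_product(j, jp):
    """Spins J on the right-hand side of (5.56), from |j - j'| up to j + j'."""
    j, jp = Fraction(j), Fraction(jp)
    J, out = abs(j - jp), []
    while J <= j + jp:
        out.append(J)
        J += 1
    return out

def m_values(j):
    j = Fraction(j)
    return [-j + k for k in range(dim(j))]

def multiplicity_from_product(j, jp, M):
    """Number of product basis vectors |m> (x) |m'> with m + m' = M."""
    return sum(1 for m in m_values(j) for mp in m_values(jp) if m + mp == M)

def multiplicity_from_sum(j, jp, M):
    """Number of summands V_J in (5.56) that contain the value M."""
    return sum(1 for J in spins_in_product(j, jp) if abs(M) <= J)

j, jp = Fraction(3, 2), Fraction(1)
print(spins_in_product(j, jp))
print(dim(j) * dim(jp), sum(dim(J) for J in spins_in_product(j, jp)))
for M in (Fraction(5, 2), Fraction(3, 2), Fraction(1, 2)):
    print(M, multiplicity_from_product(j, jp, M), multiplicity_from_sum(j, jp, M))
```

For $j = 3/2$, $j' = 1$ the multiplicities grow as 1, 2, 3 as $m$ decreases from $5/2$, matching the pattern of one-, two- and three-dimensional eigenspaces described above.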


Exercise 5.33. Check that the dimensions on both sides of (5.56) are equal. 


Another source of decomposable representations is to take an irrep of a group G 
and then consider it as a representation of a subgroup $H \subset G$. This is the context
of our next example. 


Example 5.24. Decomposition of $\Lambda^2\mathbb{R}^4$ as an $O(3)$ representation


Consider the 2nd rank antisymmetric tensor representation of $O(3,1)$ on $\Lambda^2\mathbb{R}^4$, but
restrict the representation to $O(3)$ where we view $O(3) \subset O(3,1)$ in the obvious
way. This representation of $O(3)$ is reducible, since clearly $\Lambda^2\mathbb{R}^3 \subset \Lambda^2\mathbb{R}^4$ is an
$O(3)$-invariant subspace spanned by

$$\begin{aligned}
f_1 &\equiv e_2 \wedge e_3 \\
f_2 &\equiv e_3 \wedge e_1 \\
f_3 &\equiv e_1 \wedge e_2.
\end{aligned} \tag{5.57}$$

There is also a complementary invariant subspace, spanned by

$$\begin{aligned}
f_4 &\equiv e_1 \wedge e_4 \\
f_5 &\equiv e_2 \wedge e_4 \\
f_6 &\equiv e_3 \wedge e_4.
\end{aligned} \tag{5.58}$$

This second subspace is clearly equivalent to the vector representation of $O(3)$
since $O(3)$ leaves $e_4$ unaffected (the unconvinced reader can quickly check that
the map $\phi : e_i \wedge e_4 \mapsto e_i$ is an intertwiner). Thus, as an $O(3)$ representation, $\Lambda^2\mathbb{R}^4$
decomposes into the vector and pseudovector representations, i.e.

$$\Lambda^2\mathbb{R}^4 \simeq \mathbb{R}^3 \oplus \Lambda^2\mathbb{R}^3. \tag{5.59}$$
We can interpret this physically in the case of the electromagnetic field tensor,
which lives in $\Lambda^2\mathbb{R}^4$; in that case, (5.59) says that under (proper and improper)
rotations, some components of the field tensor transform amongst themselves
as a vector, and others as a pseudovector. The components that transform as
a vector comprise the electric field, and those that transform like a pseudovector
comprise the magnetic field. To see this explicitly, one can think about which basis
vectors from (5.58) and (5.57) go with which components of the matrix in (3.49).
One could, of course, further restrict this representation to $SO(3) \subset O(3,1)$; one
would get the same decomposition (5.59), except that for $SO(3)$ the representations
$\mathbb{R}^3$ and $\Lambda^2\mathbb{R}^3$ are equivalent, and so under (proper) rotations the field tensor
transforms as a pair of vectors, the electric field “vector” and the magnetic field
“vector.” Of course, how these objects transform depends on what transformation
group you’re considering. The electric field and magnetic field both transform as
vectors under rotations, but under improper rotations the electric field transforms as
a vector and the magnetic field as a pseudovector. Furthermore, under Lorentz
transformations the electric and magnetic fields cannot be meaningfully distinguished, as
they transform together as the components of an antisymmetric second rank tensor!
Physically, this corresponds to the fact that boosts can turn electric fields in one
reference frame into magnetic fields in another, and vice-versa. □
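
To see the block structure behind (5.59) explicitly, one can compute the matrix of $\Lambda^2 R$ in the basis $f_1, \ldots, f_6$ for a few elements $R \in O(3) \subset O(3,1)$. The sketch below uses our own helper `wedge_matrix`, and assumes the action $(\Lambda^2 R)(e_i \wedge e_j) = Re_i \wedge Re_j$ with indices shifted to start at zero.

```python
# Matrix of Lambda^2 R on the basis f1..f6 of (5.57) and (5.58).
# Each pair (i, j) stands for e_i ^ e_j, with zero-based indices.
PAIRS = [(1, 2), (2, 0), (0, 1), (0, 3), (1, 3), (2, 3)]

def wedge_matrix(R):
    M = [[0] * 6 for _ in range(6)]
    for col, (i, j) in enumerate(PAIRS):
        for row, (k, l) in enumerate(PAIRS):
            # coefficient of e_k ^ e_l in (R e_i) ^ (R e_j)
            M[row][col] = R[k][i] * R[l][j] - R[l][i] * R[k][j]
    return M

def block(M, rows, cols):
    return [[M[r][c] for c in cols] for r in rows]

rotation = [[0, -1, 0, 0],
            [1, 0, 0, 0],
            [0, 0, 1, 0],
            [0, 0, 0, 1]]      # rotation by 90 degrees about the z-axis

inversion = [[-1, 0, 0, 0],
             [0, -1, 0, 0],
             [0, 0, -1, 0],
             [0, 0, 0, 1]]     # spatial inversion, an improper element of O(3)

lo, hi = range(3), range(3, 6)
for name, R in (("rotation", rotation), ("inversion", inversion)):
    M = wedge_matrix(R)
    print(name)
    print("  acts on f1..f3 by", block(M, lo, lo))
    print("  acts on f4..f6 by", block(M, hi, hi))
    print("  off-diagonal blocks", block(M, lo, hi), block(M, hi, lo))
```

For the rotation both diagonal blocks reproduce its spatial $3 \times 3$ part, while for the inversion the block on $f_4, f_5, f_6$ is minus the identity (a vector changes sign) and the block on $f_1, f_2, f_3$ is the identity (a pseudovector does not).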


Yet another source of decomposable representations is function spaces, as in
the next example.


Example 5.25. Decomposition of $L^2(S^2)$ into irreducibles


A nice example of a direct sum decomposition of a representation into its irreducible
components is furnished by $L^2(S^2)$. As we noted in Example 5.9, the spectral
theorems of functional analysis tell us that the eigenfunctions $Y^l_m$ of the spherical
Laplacian $\Delta_{S^2}$ form an orthogonal basis for the Hilbert space $L^2(S^2)$. We already
know, though, that the $Y^l_m$ of fixed $l$ form a basis for $\tilde{H}_l$, the spherical harmonics of
degree $l$. We’ll show in the next section that each of the $\tilde{H}_l$ is an irreducible $SO(3)$
representation, so we can decompose $L^2(S^2)$ into irreducible representations as

$$L^2(S^2) = \bigoplus_l \tilde{H}_l = \tilde{H}_0 \oplus \tilde{H}_1 \oplus \tilde{H}_2 \oplus \cdots$$

□

Before concluding this section we should point out that the notion of direct sum
is useful not only in decomposing a given vector space into mutually exclusive
subspaces, but also in “adding” vector spaces together. That is, given two vector
spaces $V$ and $W$, we can define their direct sum $V \oplus W$ to be the set $V \times W$ with
vector addition and scalar multiplication defined by

$$\begin{aligned}
(v_1, w_1) + (v_2, w_2) &= (v_1 + v_2,\ w_1 + w_2) \\
c(v, w) &= (cv, cw).
\end{aligned}$$

It’s straightforward to check that with vector addition and scalar multiplication so
defined, $V \oplus W$ is a bona fide vector space. Also, $V$ can be considered a subset of
$V \oplus W$, as just the set of all vectors of the form $(v, 0)$, and likewise for $W$. With this
identification it’s clear that any element $(v, w) \in V \oplus W$ can be written uniquely as
$v + w$ with $v \in V$ and $w \in W$, so this notion of a direct sum is consistent with our
earlier definition: if we take the direct sum of $V$ and $W$, the resulting vector space
really can be decomposed into the subspaces $V$ and $W$.

This notion¹⁴ of direct sum may also be extended to representations; that is,
given two representations $(\Pi_i, V_i)$, $i = 1, 2$, we may construct the direct sum
representation $(\Pi_1 \oplus \Pi_2, V_1 \oplus V_2)$ defined by

$$((\Pi_1 \oplus \Pi_2)(g))(v_1, v_2) \equiv (\Pi_1(g)v_1,\ \Pi_2(g)v_2) \qquad \forall\, v_i \in V_i,\ g \in G.$$
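
In matrix terms the direct sum representation is just a block diagonal stacking of the two representations. The sketch below illustrates this for a small group of $2 \times 2$ orthogonal matrices with integer entries; the choice of summands (the fundamental representation and the one-dimensional determinant representation) is ours, made only so that the example stays exact.

```python
# Direct sum of two representations as block diagonal matrices.

def matmul(P, Q):
    return [[sum(P[i][k] * Q[k][j] for k in range(len(Q))) for j in range(len(Q[0]))]
            for i in range(len(P))]

def block_diag(P, Q):
    p, q = len(P), len(Q)
    out = [[0] * (p + q) for _ in range(p + q)]
    for i in range(p):
        for j in range(p):
            out[i][j] = P[i][j]
    for i in range(q):
        for j in range(q):
            out[p + i][p + j] = Q[i][j]
    return out

def det2(M):
    return M[0][0] * M[1][1] - M[0][1] * M[1][0]

def fundamental(g):          # Pi_1: the matrix acts on R^2 as itself
    return g

def determinant_rep(g):      # Pi_2: one-dimensional representation g -> det g
    return [[det2(g)]]

def direct_sum(g):           # (Pi_1 (+) Pi_2)(g) acting on R^2 (+) R
    return block_diag(fundamental(g), determinant_rep(g))

def apply(M, v):
    return [sum(M[i][j] * v[j] for j in range(len(v))) for i in range(len(M))]

rot = [[0, -1], [1, 0]]      # rotation by 90 degrees
flip = [[1, 0], [0, -1]]     # reflection across the x-axis
elements = [rot, flip, matmul(rot, flip), matmul(rot, rot)]

homomorphism_ok = all(
    matmul(direct_sum(g), direct_sum(h)) == direct_sum(matmul(g, h))
    for g in elements for h in elements
)
print(homomorphism_ok)
print(apply(direct_sum(flip), [2, 3, 5]))   # acts separately on (2, 3) and on (5)
```

The homomorphism property of the direct sum follows from that of each summand, since block diagonal matrices multiply block by block.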


These constructions may seem trivial, but they have immediate physical application, 
as we’ll now see. 


Example 5.26. The Dirac Spinor 


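Consider the left- and right-handed spinor representations of $SL(2,\mathbb{C})$, $(\Pi, \mathbb{C}^2)$ and
$(\bar{\Pi}, \mathbb{C}^2)$. We define the Dirac spinor representation of $SL(2,\mathbb{C})$ to be the direct
sum representation $(\Pi \oplus \bar{\Pi}, \mathbb{C}^2 \oplus \mathbb{C}^2)$, which is then given by

$$((\Pi \oplus \bar{\Pi})(A))(v, w) \equiv (Av,\ A^{\dagger -1}w) \qquad \forall\, (v, w) \in \mathbb{C}^2 \oplus \mathbb{C}^2,\ A \in SL(2,\mathbb{C}).$$

Making the obvious identification of $\mathbb{C}^2 \oplus \mathbb{C}^2$ with $\mathbb{C}^4$, we can write $(\Pi \oplus \bar{\Pi})(A)$
in block matrix form as

$$(\Pi \oplus \bar{\Pi})(A) = \begin{pmatrix} A & 0 \\ 0 & A^{\dagger -1} \end{pmatrix} \tag{5.60}$$

as in (5.53).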

This representation is, by construction, decomposable. Why deal with a decom- 
posable representation rather than its irreducible components? There are a few 
different ways to answer this in the case of the Dirac spinor, but one rough 
answer has to do with parity. We will show in Sect.5.12 that it is impossible 
to define a consistent action of the parity operator on either the left-handed or 
right-handed spinors individually, and that what the parity operator naturally wants 
to do is interchange the two representations. Thus, to have a representation of 
SL(2,C) spinors on which parity naturally acts, we must combine both the left- 
and right-handed spinors into a Dirac spinor, and in the most natural cases parity is 
represented by 


$$(\Pi \oplus \bar{\Pi})(\text{Parity}) = \pm\begin{pmatrix} 0 & I \\ I & 0 \end{pmatrix} \tag{5.61}$$


which obviously just interchanges the left- and right-handed spinors. From this we
see that the Dirac spinor is reducible under $SL(2,\mathbb{C})$, but not under larger groups
which include parity.

¹⁴ In some texts our first notion of direct sum, in which we decompose a vector space into mutually
exclusive subspaces, is called an internal direct sum, and our second notion of direct sum, in which
we take distinct vector spaces and add them together, is known as an external direct sum.

In addition to the Weyl and Dirac spinors you may have heard of the Majorana
spinor, which is a real version of the Dirac representation. One way to obtain the
Majorana representation is to perform a similarity transformation on (5.60) which
produces a purely real matrix. This is certainly not possible for general complex
matrices, but can be done for matrices in the Dirac spinor representation. The first step
is to use the similarity transformation from Example 5.21 and Exercise 5.28 on the
second block to turn $\begin{pmatrix} A & 0 \\ 0 & A^{\dagger -1} \end{pmatrix}$ into $\begin{pmatrix} A & 0 \\ 0 & \bar{A} \end{pmatrix}$. Any matrix of this form can then
be transformed into a purely real matrix, as you will show in Problem 5-6. With
a purely real matrix in hand we are then free to restrict our vector components
to be real, yielding a representation of $SL(2,\mathbb{C})$ on $\mathbb{R}^4$. This is the Majorana
representation. You will compute the induced Lie algebra representation explicitly
in Problem 5-6(c).


5.8 More on Irreducibility 


In the last section we introduced the notion of an irreducible representation, but we 
didn’t prove that any of the representations we’ve met are irreducible. In this section 
we'll remedy that and also learn a bit more about irreducibility along the way. In 
proving the irreducibility of a given representation, the following proposition about 
the irreps of matrix Lie groups and their Lie algebras is often useful. The proposition 
just says that for a connected matrix Lie group, the irreps of the group are the same 
as the irreps of the Lie algebra. This may seem unsurprising and perhaps even trivial, 
but the conclusion does not hold when the group is disconnected. We’ll have more 
to say about this later. The proof of this proposition is also a nice exercise in using 
some of the machinery we’ve developed so far. 


Proposition 5.4. A representation $(\Pi, V)$ of a connected matrix Lie group $G$ is
irreducible if and only if the induced Lie algebra representation $(\pi, V)$ of $\mathfrak{g}$ is
irreducible as well.


Proof. First, assume that $(\Pi, V)$ is an irrep of $G$. Then consider the induced Lie
algebra representation $(\pi, V)$, and suppose that this representation has an invariant
subspace $W$. We’ll show that $W$ must be an invariant subspace of $(\Pi, V)$ as well,
which by the irreducibility of $(\Pi, V)$ will imply that $W$ is either $V$ or $\{0\}$, which
will then show that $(\pi, V)$ is irreducible. For all $X \in \mathfrak{g}$, $w \in W$, we have

$$\begin{aligned}
& \pi(X)w \in W \\
\Rightarrow\ & e^{\pi(X)}w \in W \\
\Rightarrow\ & \Pi(e^{X})w \in W.
\end{aligned} \tag{5.62}$$

Now, using the fact that any element of $G$ can be written as a product of exponentials
(cf. Proposition 5.3), we then have

$$\begin{aligned}
\Pi(g)w &= \Pi(e^{X_1}e^{X_2}\cdots e^{X_n})w \\
        &= \Pi(e^{X_1})\Pi(e^{X_2})\cdots\Pi(e^{X_n})w
\end{aligned}$$

which must be in $W$ by repeated application of (5.62). Thus $W$ is also an invariant
subspace of $(\Pi, V)$, but we assumed $(\Pi, V)$ was irreducible and so $W$ must be
equal to $V$ or $\{0\}$, which then proves that $(\pi, V)$ is irreducible as well.

Conversely, assume that $(\pi, V)$ is irreducible, and let $W$ be an invariant subspace
of $(\Pi, V)$. Then for all $t \in \mathbb{R}$, $X \in \mathfrak{g}$, $w \in W$ we know that $\Pi(e^{tX})w \in W$, and
hence

$$\begin{aligned}
\pi(X)w &= \frac{d}{dt}\,\Pi(e^{tX})w\,\Big|_{t=0} \\
        &= \lim_{t \to 0} \frac{\Pi(e^{tX})w - w}{t}
\end{aligned}$$

must be in $W$ as well, so $W$ is invariant under $\pi$ and hence must be equal to $V$
or $\{0\}$. Thus $(\Pi, V)$ is irreducible. □


Note that the assumption of connectedness was crucial in the above proof, as 
otherwise we could not have used Proposition 5.3. Furthermore, we’ll soon meet 
irreducible representations of disconnected groups (like O(3,1)) that yield Lie 
algebra representations which do have nontrivial invariant subspaces, and are thus 
not irreducible. The above proposition is still very useful, however, as we’ll see in 
this next example. 


Example 5.27. The SU(2) representation on $P_l(\mathbb{C}^2)$, revisited


In this example we’ll prove that the su(2) representations $(\pi_l, P_l(\mathbb{C}^2))$ of
Example 5.7 are all irreducible. Proposition 5.4 will then tell us that the SU(2)
representations $(\Pi_l, P_l(\mathbb{C}^2))$ are all irreducible as well. Later on, we’ll see that
these representations are in fact all the finite-dimensional irreducible representations of
SU(2)!

Recall that the $\pi_l(S_i)$ are given by

\[
\begin{aligned}
\pi_l(S_1) &= \frac{i}{2}\left(z_2\frac{\partial}{\partial z_1} + z_1\frac{\partial}{\partial z_2}\right) \\
\pi_l(S_2) &= \frac{1}{2}\left(z_2\frac{\partial}{\partial z_1} - z_1\frac{\partial}{\partial z_2}\right) \\
\pi_l(S_3) &= \frac{i}{2}\left(z_1\frac{\partial}{\partial z_1} - z_2\frac{\partial}{\partial z_2}\right).
\end{aligned}
\]

Define now the “raising” and “lowering” operators

\[
\begin{aligned}
Y_l &\equiv \pi_l(S_2) + i\pi_l(S_1) = -z_1\frac{\partial}{\partial z_2} \\
X_l &\equiv -\pi_l(S_2) + i\pi_l(S_1) = -z_2\frac{\partial}{\partial z_1}.
\end{aligned}
\]

It’s easy to see that $Y_l$ trades a factor of $z_2$ for a factor of $z_1$, thereby raising the
$\pi_l(S_3)$ eigenvalue of a single term by $i$. Likewise, $X_l$ trades a $z_1$ for a $z_2$ and lowers
the eigenvalue by $i$.^15 Consider a nonzero invariant subspace $W \subset P_l(\mathbb{C}^2)$. If we can
show that $W = P_l(\mathbb{C}^2)$, then we know that $P_l(\mathbb{C}^2)$ is irreducible. Being nonzero,
$W$ contains at least one element of the form


\[
w = a_0 z_2^l + a_1 z_1 z_2^{l-1} + \cdots + a_k z_1^k z_2^{l-k} + \cdots + a_l z_1^l,
\]

where at least one of the $a_k$ is not zero. Let $k_0$ be the biggest value of $k$ such that
$a_k$ is nonzero, so that $a_{k_0} z_1^{k_0} z_2^{l-k_0}$ is the term in $w$ with the highest power of $z_1$.
Then applying $(X_l)^{k_0}$ to $w$ lowers the $z_1$ degree by $k_0$, killing all the terms except
$a_{k_0} z_1^{k_0} z_2^{l-k_0}$. In fact, one can compute easily that

\[
(X_l)^{k_0}\left(a_{k_0} z_1^{k_0} z_2^{l-k_0}\right) = (-1)^{k_0}\, k_0!\, a_{k_0}\, z_2^l.
\]

This is proportional to $z_2^l$, and since $W$ is invariant $W$ must then contain $z_2^l$. But
then we can successively apply the raising operator $Y_l$ to get monomials of the form
$z_1^k z_2^{l-k}$ for $0 \le k \le l$, and so these must be in $W$ as well. These, however, form a
basis for $P_l(\mathbb{C}^2)$, hence $W$ must equal $P_l(\mathbb{C}^2)$, and so $P_l(\mathbb{C}^2)$ is irreducible. □
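
The argument of Example 5.27 can be traced on a concrete polynomial. The following is a minimal sketch, not part of the original text: it assumes a polynomial in $P_l(\mathbb{C}^2)$ is stored as a coefficient list `a`, with `a[k]` multiplying $z_1^k z_2^{l-k}$, and the function names `apply_X` and `apply_Y` are illustrative choices for $X_l = -z_2\,\partial/\partial z_1$ and $Y_l = -z_1\,\partial/\partial z_2$.

```python
from math import factorial

# Assumed storage convention: a[k] is the coefficient of z1^k z2^(l-k).

def apply_X(a):
    """X_l = -z2 d/dz1: z1^k z2^(l-k) -> -k z1^(k-1) z2^(l-k+1)."""
    l = len(a) - 1
    return [-(k + 1) * a[k + 1] for k in range(l)] + [0]

def apply_Y(a):
    """Y_l = -z1 d/dz2: z1^k z2^(l-k) -> -(l-k) z1^(k+1) z2^(l-k-1)."""
    l = len(a) - 1
    return [0] + [-(l - k) * a[k] for k in range(l)]

# Example with l = 3: w = 2 z2^3 + 5 z1^2 z2, so k0 = 2.
w = [2, 0, 5, 0]
k0 = max(k for k, c in enumerate(w) if c != 0)

# Apply X_l exactly k0 times; only the top term of w survives.
top = w
for _ in range(k0):
    top = apply_X(top)

# Starting from z2^l, repeated use of Y_l reaches a multiple of every monomial.
chain = [[1, 0, 0, 0]]
for _ in range(3):
    chain.append(apply_Y(chain[-1]))
```

Here `top` is the multiple of $z_2^l$ predicted by the formula for $(X_l)^{k_0}$, and each entry of `chain` is a nonzero multiple of a single monomial, so together they span $P_3(\mathbb{C}^2)$.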


Before moving on to further examples, we need to state and prove Schur’s lemma, 
one of the most basic and crucial facts about irreducible representations. Roughly 
speaking, the upshot of it is that if a linear operator on the representation space of 
an irrep commutes with the group action (i.e., is an intertwiner), then it must be 
proportional to the identity. The precise statement is as follows: 


Proposition 5.5 (Schur’s Lemma). Let $(\Pi_i, V_i)$, $i = 1,2$ be two irreducible
representations of a group or Lie algebra, and let $\phi : V_1 \to V_2$ be an intertwiner.
Then either $\phi = 0$ or $\phi$ is a vector space isomorphism. Furthermore, if $(\Pi_1, V_1) =
(\Pi_2, V_2)$ and $V_1$ is a finite-dimensional complex vector space, then $\phi$ is a multiple
of the identity, i.e. $\phi = cI$ for some $c \in \mathbb{C}$.


^15 You may object to the use of $i$ in our definition of these operators; after all, su(2) is a real Lie
algebra, and so the expression $S_2 + iS_1$ has no meaning as an element of su(2), and so one can’t say
that, for instance, $Y_l = \pi_l(S_2 + iS_1)$. Thus $X_l$ and $Y_l$ are not in the image of su(2) under $\pi_l$. This is
a valid objection, and to deal with it one must introduce the notion of the complexification of a Lie
algebra. A discussion of this here would lead us too far astray from our main goals of applications
in physics, however, so we relegate this material to the appendix, which you can consult at leisure.

Proof. We’ll prove this for group representations $\Pi_i$; the Lie algebra case follows
immediately with the obvious notational changes. Let $K$ be the kernel or null space
of $\phi$. Then $K \subset V_1$ is an invariant subspace of $\Pi_1$, since for any $v \in K$, $g \in G$,

\[
\begin{aligned}
\phi(\Pi_1(g)v) &= \Pi_2(g)(\phi(v)) \\
                &= \Pi_2(g)(0) \\
                &= 0 \\
\Rightarrow\ \Pi_1(g)v &\in K.
\end{aligned}
\]


However, since $(\Pi_1, V_1)$ is irreducible, the only invariant subspaces are 0 and $V_1$,
so $K$ must be one of those. If $K = V_1$, then $\phi = 0$, so henceforth we assume
that $K = 0$, which means that $\phi$ is one-to-one, and so $\phi(V_1) \subset V_2$ is isomorphic
to $V_1$. Furthermore, $\phi(V_1)$ is an invariant subspace of $(\Pi_2, V_2)$, since for any
$\phi(v) \in \phi(V_1)$, $g \in G$ we have

\[
(\Pi_2(g))(\phi(v)) = \phi(\Pi_1(g)v) \in \phi(V_1).
\]

But $(\Pi_2, V_2)$ is also irreducible, so $\phi(V_1)$ must equal 0 or $V_2$. We are in the case
$\phi \neq 0$, so $\phi(V_1) \neq 0$, and we conclude that $\phi(V_1) = V_2$ and hence $\phi$ is an
isomorphism.

Now assume that $(\Pi_1, V_1) = (\Pi_2, V_2) \equiv (\Pi, V)$ and that $V$ is a finite-dimensional
complex vector space of dimension $n$ (notice that we didn’t assume
finite-dimensionality at the outset of the proof). Then $\phi$ is a linear operator and the
eigenvalue equation

\[
\det(\phi - \lambda I) = 0
\]

is an $n$th degree complex polynomial equation in $\lambda$. By the fundamental theorem of algebra,^16
the polynomial on the left-hand side has at least one root $c \in \mathbb{C}$, hence $\phi - cI$ has
determinant 0 and is thus noninvertible. This means that $\phi - cI$ has a nontrivial kernel $K$.
$K$ is invariant, though; for all $v \in K$,

\[
\begin{aligned}
(\phi - cI)(\Pi(g)v) &= \phi(\Pi(g)v) - c\,\Pi(g)v \\
                     &= \Pi(g)(\phi(v) - cv) \\
                     &= 0 \\
\Rightarrow\ \Pi(g)v &\in K
\end{aligned}
\]

and since we know $K \neq 0$, we then conclude by irreducibility of $V$ that $K = V$.
This means that $(\phi - cI)(v) = 0\ \forall\, v \in V$, which means $\phi = cI$, as desired. □


^16 See Herstein [12], for instance.

Exercise 5.34. Prove the following corollary of Schur’s lemma: If $(\Pi_i, V_i)$, $i = 1,2$ are
two complex irreducible representations of a group or Lie algebra, and $\phi, \psi : V_1 \to V_2$
are two intertwiners with $\phi \neq 0$, then $\psi = c\phi$ for some $c \in \mathbb{C}$.


Before applying Schur’s lemma, we should note that it embodies the connection
between symmetry and degeneracy that is often mentioned in quantum mechanics
texts. If $(\Pi, V)$ is an irrep of some symmetry group $G$ and $H \in GL(V)$ is an
intertwiner (i.e., commutes with $\Pi(g)$ for all $g \in G$), then Schur’s lemma says
that $H = cI$. This means that all vectors in $V$ have the same $H$-eigenvalue, i.e.
they are degenerate. Thus if $(\Pi, V)$ is the angular momentum $j$ representation of
$G = SU(2)$ and $H$ is a quantum-mechanical Hamiltonian commuting with the $\Pi(g)$, then
this means that all the spin $j$ states will have the same energy, so there will be a
$\dim V = (2j + 1)$-fold degeneracy.
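
Schur’s lemma can be probed numerically by computing the dimension of the space of matrices that commute with every generator of a representation. The sketch below is an illustration under stated assumptions rather than anything from the text: it takes the su(2) basis in the fundamental representation to be $S_i = -\tfrac{i}{2}\sigma_i$, uses commutation with the Lie algebra generators as a stand-in for commutation with all of SU(2) (reasonable because SU(2) is connected), and the helper name `commutant_dim` is hypothetical.

```python
import numpy as np

# Assumed basis of su(2) in the fundamental representation: S_i = -(i/2) sigma_i.
S = [0.5 * np.array(m, dtype=complex) for m in (
    [[0, -1j], [-1j, 0]],
    [[0, -1], [1, 0]],
    [[-1j, 0], [0, 1j]],
)]

def commutant_dim(mats):
    """Dimension of the space of matrices H with H A = A H for every A in mats."""
    n = mats[0].shape[0]
    I = np.eye(n)
    # With row-major flattening, vec(A H - H A) = (A kron I - I kron A^T) vec(H).
    system = np.vstack([np.kron(A, I) - np.kron(I, A.T) for A in mats])
    return n * n - np.linalg.matrix_rank(system)

# Irreducible case: only multiples of the identity commute with all S_i.
dim_irreducible = commutant_dim(S)

# Reducible case: two copies of the fundamental, block-diagonal generators.
doubled = [np.kron(np.eye(2), A) for A in S]
dim_reducible = commutant_dim(doubled)
```

For the irreducible representation the commutant is one-dimensional, as Schur’s lemma requires; for the doubled representation it is four-dimensional, so an intertwining Hamiltonian need not be a multiple of the identity there.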

It should be noted, however, that symmetry does not always imply degeneracy.
For instance, the double delta function potential well problem (see, e.g.,
Gasiorowicz [7]) has a parity-symmetric Hamiltonian (i.e., $[H, P] = 0$) but only two energy
eigenfunctions, with differing energies. The fact that parity symmetry does not
imply degeneracy can be seen as a consequence of the following proposition, which
is our first application of Schur’s lemma:


Proposition 5.6. An irreducible finite-dimensional complex representation of an 
abelian group or Lie algebra is one-dimensional. 


Proof. Again we prove only the group case. Since $G$ is abelian, each $\Pi(g)$
commutes with $\Pi(h)$ for all $h \in G$, hence each $\Pi(g) : V \to V$ is an intertwiner!
By Schur’s lemma, this implies that every $\Pi(g)$ is proportional to the identity
(with possibly varying proportionality constants), and so every subspace of $V$ is
an invariant one. Thus the only way $V$ could have no nontrivial invariant subspaces
is to have no nontrivial subspaces at all, which means it must be one-dimensional. □


Exercise 5.35. Show that the fundamental representation of SO(2) on $\mathbb{R}^2$ is irreducible.
Prove this by contradiction, showing that if the fundamental representation were reducible
then the SO(2) generator

\[
\begin{pmatrix} 0 & -1 \\ 1 & 0 \end{pmatrix}
\]

would be diagonalizable over the real numbers, which you should show it is not. This shows
that one really needs the hypothesis of a complex vector space in the above proposition.


Example 5.28. The irreducible representations of $\mathbb{Z}_2$


Proposition 5.6 allows us to easily enumerate all the irreducible representations
of $\mathbb{Z}_2$. Since $\mathbb{Z}_2$ is abelian any irreducible representation $(\Pi_{irr}, V)$ must be
one-dimensional (i.e., $V = \mathbb{R}$ or $\mathbb{C}$), and $\Pi_{irr}$ must also satisfy

\[
(\Pi_{irr}(-1))^2 = \Pi_{irr}((-1)^2) = \Pi_{irr}(1) = 1,
\]

which means that $\Pi_{irr}(-1) = \pm 1$, i.e. $\Pi_{irr}$ is either the alternating representation
or the trivial representation! Furthermore, $\mathbb{Z}_2$ is semi-simple (as you will show
in Problem 5-9), and so any representation $(\Pi, V)$ of $\mathbb{Z}_2$ is completely reducible
and thus decomposes into one-dimensional irreducible subspaces, on which $\Pi(-1)$
equals either 1 or $-1$. Thus $\Pi(-1)$ is diagonalizable, with eigenvalues $\pm 1$ (cf.
Example 5.18 for examples of this). Let’s say the $\mathbb{Z}_2$ in question is $\mathbb{Z}_2 \simeq \{I, P\} \subset O(3)$,
and that we’re working in a quantum-mechanical context with a Hilbert space
$\mathcal{H}$. Then there exists a basis for $\mathcal{H}$ consisting of eigenvectors of $\Pi(P)$; for a given
eigenvector $\psi$, its eigenvalue of $\pm 1$ is known as its parity. If the eigenvalue is $+1$,
then $\psi$ is said to have even parity, and if the eigenvalue is $-1$, then $\psi$ is said to
have odd parity. If $[H, \Pi(P)] = 0$, then the energy eigenfunctions can be taken
to be parity eigenvectors, but as mentioned above this does not necessarily imply
degeneracy of the energy eigenvalues. □


5.9 The Irreducible Representations of su(2), SU(2), 
and SO(3) 


In this section we’ll construct (up to equivalence) all the finite-dimensional 
irreducible complex representations of su(2). Besides being of intrinsic interest, 
our results will also allow us to classify all the irreducible representations
of SO(3), SU(2), and even the apparently unrelated representations of
so(3,1), SO(3,1), and SL(2,C). The construction we’ll give is more or less
the same as that found in the physics literature under the heading “theory of angular 
momentum,” except that we’re using different language and notation. Our strategy 
will be to use the commutation relations to deduce the possible structures of su(2) 
irreps, and then show that we’ve already constructed representations which exhaust 
these possibilities, thus yielding a complete classification. 

Let $(\pi, V)$ be a finite-dimensional complex irreducible representation of su(2).
It will be convenient to use the following shorthand, familiar from the physics
literature:

\[
\begin{aligned}
J_z &\equiv i\pi(S_z) \\
J_+ &\equiv i\pi(S_x) - \pi(S_y) \\
J_- &\equiv i\pi(S_x) + \pi(S_y).
\end{aligned}
\tag{5.63}
\]

These “raising” and “lowering” operators obey the following commutation relations,
as you can check:

\[
\begin{aligned}
[J_z, J_\pm] &= \pm J_\pm \\
[J_+, J_-] &= 2J_z.
\end{aligned}
\]

Now, as discussed in our proof of Schur’s lemma, the fact that $V$ is complex means
that every operator on $V$ has at least one eigenvector. In particular, this means that
$J_z$ has an eigenvector $v$ with eigenvalue $b$. The above commutation relations then
imply that

\[
J_z(J_+ v) = [J_z, J_+]\,v + J_+(J_z v) = (b + 1)\,J_+ v
\]


so that if $J_+ v$ is not zero (which it might be!), then it is another eigenvector of $J_z$
with eigenvalue $b + 1$. Now, we can repeatedly apply $J_+$ to $v$ to yield more and more
eigenvectors of $J_z$, but since $V$ is finite-dimensional and eigenvectors with different
eigenvalues are linearly independent (see Exercise 5.36 below), this process must
end somewhere, say at $N$ applications of $J_+$. Let $v_0$ be this vector with the highest
eigenvalue (also known as the highest weight vector of the representation), so that
we have

\[
\begin{aligned}
v_0 &= (J_+)^N v \\
J_+ v_0 &= 0.
\end{aligned}
\]

Then $v_0$ has a $J_z$ eigenvalue of $b + N \equiv j$ (note that so far we haven’t
proved anything about $b$ or $j$, but we will soon see that they must be integral or
half-integral). Starting with $v_0$, then, we can repeatedly “lower” with $J_-$ to get
eigenvectors with lower eigenvalues. In fact, we can define

\[
v_k \equiv (J_-)^k v_0
\]

which has $J_z$ eigenvalue $j - k$. This chain must also end, though, so there must exist
an integer $l$ such that

\[
v_l = (J_-)^l v_0 \neq 0 \quad \text{but} \quad v_{l+1} = (J_-)^{l+1} v_0 = 0.
\]


How can we find $l$? For this we’ll need the following formula, which you will prove
in Exercise 5.37 below:

\[
J_+(v_k) = [2jk - k(k - 1)]\,v_{k-1}. \tag{5.64}
\]

Applying this to $v_{l+1} = 0$ gives

\[
0 = J_+(v_{l+1}) = [2j(l + 1) - l(l + 1)]\,v_l
\]

and since $v_l \neq 0$ we conclude that

\[
2j(l + 1) - l(l + 1) = 0 \iff j = l/2.
\]

Thus $j$ is a nonnegative integer or half integer! (In fact, $j$ is just the “spin” of the
representation.) We further conclude that $V$ contains $2j + 1$ vectors $\{v_k \mid 0 \le k \le 2j\}$,
all of which are eigenvectors of $J_z$ with eigenvalue $j - k$. Furthermore, since

\[
\begin{aligned}
\pi(S_x) &= -\frac{i}{2}(J_+ + J_-) \\
\pi(S_y) &= \frac{1}{2}(J_- - J_+),
\end{aligned}
\]

the actions of $S_x$ and $S_y$ take a given $v_k$ into a linear combination of other $v_k$, so
the span of the $v_k$ is a nonzero invariant subspace of $V$. We assumed $V$ was irreducible,
however, so we must have $V = \mathrm{Span}\{v_k\}$, and since the $v_k$ are linearly independent,
they form a basis for $V$!

To summarize, any finite-dimensional irreducible complex representation of
su(2) has dimension $2j + 1$, $2j \in \mathbb{N}$, and a basis $\{v_k\}_{k=0,\ldots,2j}$ which satisfies

\[
\begin{aligned}
J_+(v_0) &= 0 \\
J_-(v_k) &= v_{k+1}, && k < 2j \\
J_z(v_k) &= (j - k)\,v_k \\
J_-(v_{2j}) &= 0 \\
J_+(v_k) &= [2jk - k(k - 1)]\,v_{k-1}, && k \neq 0.
\end{aligned}
\tag{5.65}
\]

What’s more, we can actually use the above equations to define representations
$(\pi_j, V_j)$, where $V_j$ is a $(2j + 1)$-dimensional vector space with basis $\{v_k\}_{k=0,\ldots,2j}$
and the action of the operators $\pi_j(S_i)$ is defined by (5.65). It’s straightforward to
check that this defines a representation of su(2) (see Exercise 5.38 below), and one can
prove irreducibility in the same way that we did in Example 5.27. Furthermore, any
irrep $(\pi, V)$ of su(2) must be equivalent to $(\pi_j, V_j)$ for some $j$, since we can find a
basis $w_k$ for $V$ satisfying (5.65) for some $j$ and then define an intertwiner by

\[
\begin{aligned}
\phi : V &\to V_j \\
w_k &\mapsto v_k
\end{aligned}
\]

and extending linearly. We have thus proved the following:


Proposition 5.7. The su(2) representations $(\pi_j, V_j)$, $2j \in \mathbb{N}$ defined above are all
irreducible, and any other finite-dimensional complex irreducible representation of
su(2) is equivalent to $(\pi_j, V_j)$ for some $j$, $2j \in \mathbb{N}$.


In other words, the $(\pi_j, V_j)$ are, up to equivalence, all the finite-dimensional complex
irreducible representations of su(2). They are also all the finite-dimensional
complex irreducible representations of so(3), since su(2) $\simeq$ so(3).

If you look back over our arguments you’ll see that we deduced (5.65) from just
the su(2) commutation relations, the finite-dimensionality of $V$, and the existence
of a highest weight vector $v_0$ satisfying $J_+(v_0) = 0$ and $J_z(v_0) = jv_0$. Thus,
if we have an arbitrary (i.e., not necessarily irreducible) finite-dimensional su(2)
representation $(\pi, V)$ and can find a highest weight vector $v_0$ for some $j$, we can
lower with $J_-$ to generate vectors $\{v_k\}_{k=0,\ldots,2j}$ satisfying (5.65) and conclude that
$V$ has an invariant subspace equivalent to $(\pi_j, V_j)$. We can then repeat this until $V$
is completely decomposed into irreps. If we know that $(\pi, V)$ is irreducible from
the start, then we don’t even have to find the vector $v_0$; we just use the fact that
$(\pi, V)$ must be equivalent to $(\pi_j, V_j)$ for some $j$ and note that $j$ is given by
$j = \frac{1}{2}(\dim V - 1)$. These observations make it easy to identify which $(\pi_j, V_j)$ occur in
any given su(2) representation.
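
The relations (5.65) translate directly into matrices. The following is a minimal numerical sketch, assuming the su(2) commutation relations are $[S_i, S_j] = \sum_k \epsilon_{ijk} S_k$ and storing the image of $v_k$ in column $k$; the function names are illustrative and not from the text.

```python
import numpy as np

def ladder_matrices(two_j):
    """Matrices of J_z, J_+, J_- in the basis v_0, ..., v_{2j} of (5.65).

    Column k of each matrix holds the image of v_k.
    """
    j, n = two_j / 2, two_j + 1
    Jz = np.diag([j - k for k in range(n)]).astype(complex)
    Jp = np.zeros((n, n), dtype=complex)
    Jm = np.zeros((n, n), dtype=complex)
    for k in range(1, n):
        Jm[k, k - 1] = 1                          # J_- v_{k-1} = v_k
        Jp[k - 1, k] = 2 * j * k - k * (k - 1)    # J_+ v_k = [2jk - k(k-1)] v_{k-1}
    return Jz, Jp, Jm

def comm(A, B):
    return A @ B - B @ A

def generators(two_j):
    """pi_j(S_x), pi_j(S_y), pi_j(S_z), obtained by inverting (5.63)."""
    Jz, Jp, Jm = ladder_matrices(two_j)
    return -0.5j * (Jp + Jm), 0.5 * (Jm - Jp), -1j * Jz

def is_representation(two_j):
    """Check [S_x, S_y] = S_z and its cyclic permutations."""
    Sx, Sy, Sz = generators(two_j)
    return bool(np.allclose(comm(Sx, Sy), Sz)
                and np.allclose(comm(Sy, Sz), Sx)
                and np.allclose(comm(Sz, Sx), Sy))
```

Calling `is_representation(two_j)` for several values of `two_j` gives a quick consistency check on (5.65); it does not replace the proof asked for in Exercise 5.38.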

Note that if we have a finite-dimensional complex irreducible su(2) representation
that is also unitary, then we could work in an orthonormal basis. In that case,
it turns out that the $v_k$ defined above are not orthonormal, and are thus not ideal
basis vectors to work with. They are orthogonal, but are not normalized to have
unit length. In quantum-mechanical contexts the su(2) representations usually are
unitary, and so in that setting one works with the orthonormal basis vectors $|m\rangle$,
$-j \le m \le j$. The vector $|m\rangle$ is proportional to our $v_{j-m}$, but is normalized. See
Sakurai [17] for details on the normalization procedure.


Exercise 5.36. Let $S = \{v_i\}_{i=1,\ldots,k}$ be a set of eigenvectors of some linear operator $T$ on
a vector space $V$. Show that if the $v_i$ have pairwise distinct eigenvalues, then $S$ is a linearly
independent set. (Hint: One way to do this is by induction on $k$. Another is to argue by
assuming that $S$ is linearly dependent and reaching a contradiction. In this case you may
assume without loss of generality that the $v_i$, $1 \le i \le k - 1$ are linearly independent, so
that $v_k$ is the vector that spoils the assumed linear independence.)


Exercise 5.37. Prove (5.64). Proceed by induction, i.e. first prove the formula for $k = 1$,
then assume it is true for $k$ and show that it must be true for $k + 1$.


Exercise 5.38. Show that (5.65) defines a representation of su(2). This consists of showing
that the operators $J_+$, $J_-$, $J_z$ satisfy the appropriate commutation relations. Then show that
this representation is irreducible, using an argument similar to the one from Example 5.27.


Example 5.29. $P_l(\mathbb{C}^2)$, revisited again


In Example 5.7 we met the representations $(\pi_l, P_l(\mathbb{C}^2))$ of su(2) on the space of
degree $l$ polynomials in two complex variables. In Example 5.27 we saw that these
representations are all irreducible, and so by setting $\dim P_l(\mathbb{C}^2) = l + 1$ equal to
$2j + 1$ we deduce that

\[
(\pi_l, P_l(\mathbb{C}^2)) \simeq (\pi_{l/2}, V_{l/2}) \tag{5.66}
\]

and so the $(\pi_l, P_l(\mathbb{C}^2))$, $l \in \mathbb{N}$ also yield all the complex finite-dimensional
irreps of su(2). What’s more, this allows us to enumerate all the finite-dimensional
complex irreps of the associated group SU(2). Any irrep $(\Pi, V)$ of SU(2) yields
an irrep $(\pi, V)$ of su(2), by Proposition 5.4. This irrep must be equivalent to
$(\pi_l, P_l(\mathbb{C}^2))$ for some $l \in \mathbb{N}$, however, and so by Proposition 5.3 $(\Pi, V)$ is
equivalent to $(\Pi_l, P_l(\mathbb{C}^2))$. Thus, the representations $(\Pi_l, P_l(\mathbb{C}^2))$, $l \in \mathbb{N}$ are (up
to equivalence) all the finite-dimensional complex irreducible representations
of SU(2)!

It’s instructive to construct the equivalence (5.66) explicitly. Recall that the
raising and lowering operators (which we called $Y_l$ and $X_l$ in Example 5.27) are
given by

\[
\begin{aligned}
J_+ &= -z_2\frac{\partial}{\partial z_1} \\
J_- &= -z_1\frac{\partial}{\partial z_2}
\end{aligned}
\]

and that

\[
J_z = i\pi_l(S_z) = \frac{1}{2}\left(z_2\frac{\partial}{\partial z_2} - z_1\frac{\partial}{\partial z_1}\right).
\]

(Note that $J_+ = X_l$ and $J_- = Y_l$: raising the $\pi_l(S_3)$ eigenvalue by $i$ lowers the
eigenvalue of $J_z = i\pi_l(S_3)$ by 1.) It’s easy to check that $v_0 = z_2^l$ is a highest weight
vector with $j = l/2$, and so the basis that satisfies (5.65) is given by

\[
v_k \equiv (J_-)^k (z_2^l) = (-1)^k \frac{l!}{(l-k)!}\, z_1^k z_2^{l-k}, \qquad 0 \le k \le l. \tag{5.67}
\]


Exercise 5.39. Show by direct calculation that $(\pi_{2j}, P_{2j}(\mathbb{C}^2))$ satisfies (5.65) with basis
vectors given by (5.67).
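
Before attempting the calculation by hand, it can help to confirm (5.67) against (5.65) for a few small values of $l$. The sketch below is illustrative only: it assumes polynomials are stored as coefficient lists with `a[k]` multiplying $z_1^k z_2^{l-k}$, takes $J_+ = -z_2\,\partial/\partial z_1$, $J_- = -z_1\,\partial/\partial z_2$ and $J_z = \tfrac12(z_2\,\partial/\partial z_2 - z_1\,\partial/\partial z_1)$, and uses exact rational arithmetic so that no tolerance is needed.

```python
from fractions import Fraction
from math import factorial

def J_plus(a):
    """-z2 d/dz1 on a coefficient list a, with a[k] multiplying z1^k z2^(l-k)."""
    l = len(a) - 1
    return [-(k + 1) * a[k + 1] for k in range(l)] + [0]

def J_minus(a):
    """-z1 d/dz2."""
    l = len(a) - 1
    return [0] + [-(l - k) * a[k] for k in range(l)]

def J_z(a):
    """(1/2)(z2 d/dz2 - z1 d/dz1): eigenvalue (l - 2k)/2 on z1^k z2^(l-k)."""
    l = len(a) - 1
    return [Fraction(l - 2 * k, 2) * a[k] for k in range(l + 1)]

def v(k, l):
    """The basis vector v_k of (5.67)."""
    a = [0] * (l + 1)
    a[k] = (-1) ** k * factorial(l) // factorial(l - k)
    return a

def satisfies_5_65(l):
    """Check every relation in (5.65) with j = l/2."""
    j = Fraction(l, 2)
    zero = [0] * (l + 1)
    ok = J_plus(v(0, l)) == zero and J_minus(v(l, l)) == zero
    for k in range(l + 1):
        ok = ok and J_z(v(k, l)) == [(j - k) * c for c in v(k, l)]
        if k < l:
            ok = ok and J_minus(v(k, l)) == v(k + 1, l)
        if k > 0:
            coeff = 2 * j * k - k * (k - 1)
            ok = ok and J_plus(v(k, l)) == [coeff * c for c in v(k - 1, l)]
    return ok
```

A call such as `satisfies_5_65(4)` checks the $j = 2$ case; a finite check of this kind is only a sanity test, not the direct calculation the exercise asks for.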


Example 5.30. $S^{2j}(\mathbb{C}^2)$ as irreps of su(2)


We suggested in Example 5.15 that the tensor product representation of su(2)
on $S^{2j}(\mathbb{C}^2)$, the totally symmetric $(0, 2j)$ tensors on $\mathbb{C}^2$, is equivalent to
$(\pi_{2j}, P_{2j}(\mathbb{C}^2)) \simeq (\pi_j, V_j)$, $2j \in \mathbb{N}$. With the classification of su(2) irreps in
place, we can now prove this fact. It’s easy to verify that

\[
v_0 \equiv \underbrace{e_1 \otimes \cdots \otimes e_1}_{2j \text{ times}} \in S^{2j}(\mathbb{C}^2)
\]

is a highest weight vector with eigenvalue $j$, so there is an irreducible invariant
subspace of $S^{2j}(\mathbb{C}^2)$ equivalent to $V_j$. Since $\dim V_j = \dim S^{2j}(\mathbb{C}^2) = 2j + 1$ (as
you can check), we conclude that this subspace is all of $S^{2j}(\mathbb{C}^2)$, and hence that

\[
(S^{2j}\pi, S^{2j}(\mathbb{C}^2)) \simeq (\pi_j, V_j) \simeq (\pi_{2j}, P_{2j}(\mathbb{C}^2))
\]

which also implies $(S^{2j}\Pi, S^{2j}(\mathbb{C}^2)) \simeq (\Pi_{2j}, P_{2j}(\mathbb{C}^2))$. Thus, we see that:


Every irreducible representation of su(2) and SU(2) can be obtained by
taking a symmetric tensor product of the fundamental representation.

Thus, using nothing more than the fundamental representation (which corresponds
to $j = 1/2$, as expected) and the tensor product, we can generate all the irreducible
representations of su(2) and SU(2). We’ll see in Sect. 5.11 that the same
is true for so(3,1) and SO(3,1)$_o$. □


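Our results about the finite-dimensional irreps of su(2) and SU(2) are not only
interesting in their own right; they also allow us to determine all the irreps of
SO(3)! To see this, first consider the degree $l$ harmonic polynomial representation
$(\Pi_l, H_l(\mathbb{R}^3))$ of SO(3). This induces a representation $(\pi_l, H_l(\mathbb{R}^3))$ of so(3) $\simeq$
su(2). It’s easy to check (see Exercise 5.40 below) that if we define, in analogy to the
su(2) case,

\[
\begin{aligned}
J_z &\equiv i\pi_l(L_z) = i\left(y\frac{\partial}{\partial x} - x\frac{\partial}{\partial y}\right) \\
J_+ &\equiv i\pi_l(L_x) - \pi_l(L_y) = i\left(z\frac{\partial}{\partial y} - y\frac{\partial}{\partial z}\right) - \left(x\frac{\partial}{\partial z} - z\frac{\partial}{\partial x}\right) \\
J_- &\equiv i\pi_l(L_x) + \pi_l(L_y) = i\left(z\frac{\partial}{\partial y} - y\frac{\partial}{\partial z}\right) + \left(x\frac{\partial}{\partial z} - z\frac{\partial}{\partial x}\right)
\end{aligned}
\]

then the vector $f_0 \equiv (x + iy)^l$ is a highest weight vector with eigenvalue $l$, and so
$H_l(\mathbb{R}^3)$ has an invariant subspace equivalent to $(\pi_l, V_l)$. We will argue in Problem
5-10 that $\dim H_l(\mathbb{R}^3) = 2l + 1$, so we conclude that $(\pi_l, H_l(\mathbb{R}^3)) \simeq (\pi_l, V_l)$, and
is hence an irrep of so(3). Proposition 5.4 then implies that $(\Pi_l, H_l(\mathbb{R}^3))$ is an irrep
of SO(3)!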
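
The highest weight property of $f_0 = (x + iy)^l$ can be checked mechanically for small $l$. The following sketch makes several assumptions that are not in the text: polynomials in $x, y, z$ are stored as dictionaries mapping exponent triples to coefficients, the operators are taken to be $J_z = i(y\,\partial_x - x\,\partial_y)$ and $J_+ = i(z\,\partial_y - y\,\partial_z) - (x\,\partial_z - z\,\partial_x)$, and all helper names are illustrative.

```python
def scale(p, s):
    """Multiply polynomial p by the scalar s, dropping zero terms."""
    return {e: s * c for e, c in p.items() if s * c != 0}

def add(p, q):
    """Sum of two polynomials, dropping zero terms."""
    out = dict(p)
    for e, c in q.items():
        out[e] = out.get(e, 0) + c
    return {e: c for e, c in out.items() if c != 0}

def times(p, i):
    """Multiply by the i-th variable (0 = x, 1 = y, 2 = z)."""
    return {e[:i] + (e[i] + 1,) + e[i + 1:]: c for e, c in p.items()}

def d(p, i):
    """Partial derivative with respect to the i-th variable."""
    return {e[:i] + (e[i] - 1,) + e[i + 1:]: e[i] * c
            for e, c in p.items() if e[i] > 0}

X, Y, Z = 0, 1, 2

def rot(p, a, b):
    """(x_a d/dx_b - x_b d/dx_a) applied to p."""
    return add(times(d(p, b), a), scale(times(d(p, a), b), -1))

def J_z(p):
    return scale(rot(p, Y, X), 1j)        # i (y d/dx - x d/dy)

def J_plus(p):
    # i (z d/dy - y d/dz) - (x d/dz - z d/dx)
    return add(scale(rot(p, Z, Y), 1j), scale(rot(p, X, Z), -1))

def laplacian(p):
    out = {}
    for i in (X, Y, Z):
        out = add(out, d(d(p, i), i))
    return out

def f0(l):
    """(x + i y)^l built by repeated multiplication."""
    p = {(0, 0, 0): 1}
    for _ in range(l):
        p = add(times(p, X), scale(times(p, Y), 1j))
    return p
```

An empty dictionary represents the zero polynomial, so `J_plus(f0(l)) == {}` expresses $J_+ f_0 = 0$ and `laplacian(f0(l)) == {}` expresses harmonicity.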

Are these all the irreps of SO(3)? To find out, let $(\Pi, V)$ be an arbitrary
finite-dimensional complex irrep of SO(3). Then the induced so(3) $\simeq$ su(2)
representation $(\pi, V)$ must be equivalent to $(\pi_j, V_j)$ for some integral or
half-integral $j$. We just saw that any integral $j$ value is possible, by taking $(\Pi, V) =
(\Pi_j, H_j(\mathbb{R}^3))$. What about half-integral values of $j$? In this case, we have (careful
not to confuse the number $\pi$ with the representation $\pi$!)

\[
\begin{aligned}
e^{2\pi \cdot \pi(L_z)} v_0 &= e^{-i 2\pi J_z} v_0 \\
&= e^{-i 2\pi j} v_0 && \text{since } v_0 \text{ has eigenvalue } j \\
&= -v_0 && \text{since } j \text{ half-integral.}
\end{aligned}
\]

However, we also have

\[
\begin{aligned}
e^{2\pi \cdot \pi(L_z)} v_0 &= \Pi(e^{2\pi L_z})\, v_0 \\
&= (\Pi(I))\, v_0 && \text{by (4.72)} \\
&= v_0,
\end{aligned}
\]

a contradiction. Thus, $j$ cannot be half-integral, so $(\pi, V)$ must be equivalent to
$(\pi_j, V_j)$ for $j$ integral. But this implies that $(\pi, V) \simeq (\pi_j, H_j(\mathbb{R}^3))$, and so by
Proposition 5.3, $(\Pi, V)$ is equivalent to $(\Pi_j, H_j(\mathbb{R}^3))$! Thus:


The representations $(\Pi_j, H_j(\mathbb{R}^3))$, $j \in \mathbb{N}$ are (up to equivalence) all the
finite-dimensional complex irreducible representations of SO(3).
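
The sign discrepancy at the heart of this argument shows up already in the smallest cases. The sketch below is a numerical illustration under assumptions not fixed by this passage: $L_z$ is taken to be the standard so(3) generator of rotations about the $z$-axis, $S_z = \tfrac12\,\mathrm{diag}(-i, i)$ is taken as the corresponding su(2) element, and `expm_series` is a hypothetical helper that sums a truncated Taylor series (adequate for these small matrices, not a general-purpose routine).

```python
import numpy as np

def expm_series(A, terms=80):
    """Matrix exponential via a truncated Taylor series."""
    result = np.eye(A.shape[0], dtype=complex)
    term = np.eye(A.shape[0], dtype=complex)
    for n in range(1, terms):
        term = term @ A / n
        result = result + term
    return result

# Assumed generators: L_z in the fundamental of so(3), S_z in the fundamental of su(2).
Lz = np.array([[0, -1, 0], [1, 0, 0], [0, 0, 0]], dtype=complex)
Sz = 0.5 * np.array([[-1j, 0], [0, 1j]])

rot_so3 = expm_series(2 * np.pi * Lz)   # a full rotation in SO(3)
rot_su2 = expm_series(2 * np.pi * Sz)   # the same Lie algebra direction in SU(2)
```

In SO(3) the exponential returns to the identity, while in the $j = 1/2$ representation it gives minus the identity, which is why the half-integral $j$ cannot come from a representation of SO(3).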


An important lesson to take away from this is that for a matrix Lie group G with 
Lie algebra g, not all representations of g necessarily come from representations 
of G. If G = SU(2), then there is a one-to-one correspondence between Lie algebra 
representations and group representations, but in the case of SO(3) there are Lie 
algebra representations (corresponding to half-integral values of j) that don’t come 
from SO(3) representations. We won’t say much more about this here, except to 
note that this is connected to the fact that there are non-isomorphic matrix Lie groups 
that have isomorphic Lie algebras, as is the case with SU(2) and SO(3). For a more 
complete discussion, see Hall [11]. 


Exercise 5.40. Verify that $J_+ f_0 = 0$ and $J_z f_0 = l f_0$.


5.10 Real Representations and Complexifications 


So far we have classified all the complex finite-dimensional irreps of su(2), SU(2), 
and SO(3), but we haven’t said anything about real representations, despite the fact 
that many of the most basic representations of these groups and Lie algebras (like 
the fundamental of SO(3) and all its various tensor products) are real. Fortunately, 
there is a way to turn every real representation into a complex representation, so that 
we can then apply our classification of complex irreps. Given any real vector space
$V$, we can define the complexification of $V$ as $V_\mathbb{C} \equiv \mathbb{C} \otimes V$, where $\mathbb{C}$ and $V$ are
both thought of as real vector spaces ($\mathbb{C}$ being a two-dimensional real vector space
with basis $\{1, i\}$), so that if $\{e_j\}$ is a basis for $V$ then $\{1 \otimes e_j,\ i \otimes e_j\}$ is a (real)
basis for $V_\mathbb{C}$. Note that $V_\mathbb{C}$ also carries the structure of a complex vector space, with
multiplication by $i$ defined by

\[
i(z \otimes v) \equiv (iz) \otimes v, \qquad z \in \mathbb{C},\ v \in V.
\]

A complex basis for $V_\mathbb{C}$ is then given by $\{1 \otimes e_j\}$, and the complex dimension of $V_\mathbb{C}$
is equal to the real dimension of $V$.


We can then define the complexification of a real representation $(\Pi, V)$ to be the
(complex) representation $(\Pi_\mathbb{C}, V_\mathbb{C})$ defined by

\[
(\Pi_\mathbb{C}(g))(z \otimes v) \equiv z \otimes \Pi(g)v. \tag{5.68}
\]

We can then get a handle on the real representation $(\Pi, V)$ by applying our
classification scheme to its complexification $(\Pi_\mathbb{C}, V_\mathbb{C})$. You should be aware,
however, that the irreducibility of $(\Pi, V)$ does not guarantee the irreducibility of
$(\Pi_\mathbb{C}, V_\mathbb{C})$ (see Exercise 5.42 and Example 5.39), though in many cases $(\Pi_\mathbb{C}, V_\mathbb{C})$
will end up being irreducible.


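Example 5.31. The complexifications of $\mathbb{R}^n$ and $M_n(\mathbb{R})$


As a warm-up to considering complexifications of representations, we consider the
complexification of a simple vector space. Consider the complexification $\mathbb{R}^n_\mathbb{C}$ of $\mathbb{R}^n$.
We can define the (obvious) complex-linear map

\[
\begin{aligned}
\phi : \mathbb{R}^n_\mathbb{C} &\to \mathbb{C}^n \\
1 \otimes (a^j e_j) + i \otimes (b^j e_j) &\mapsto (a^j + i b^j)\, e_j, \qquad a^j, b^j \in \mathbb{R}
\end{aligned}
\]

which is easily seen to be a vector space isomorphism, so we can identify $\mathbb{R}^n_\mathbb{C}$ with
$\mathbb{C}^n$. One can also extend this argument in the obvious way to show that

\[
M_n(\mathbb{R})_\mathbb{C} \simeq M_n(\mathbb{C}).
\]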


Example 5.32. The fundamental representation of so(3)


Now consider the complexification $(\pi_\mathbb{C}, \mathbb{R}^3_\mathbb{C})$ of the fundamental representation
of so(3). As explained above, $\mathbb{R}^3_\mathbb{C}$ can be identified with $\mathbb{C}^3$, and a moment’s
consideration of (5.68) will show that the complexification of the fundamental
representation of so(3) is just given by the usual so(3) matrices acting on $\mathbb{C}^3$ rather
than $\mathbb{R}^3$. You should check that $J_z$ and $J_+$ are given by

\[
J_z = \begin{pmatrix} 0 & -i & 0 \\ i & 0 & 0 \\ 0 & 0 & 0 \end{pmatrix}, \qquad
J_+ = \begin{pmatrix} 0 & 0 & -1 \\ 0 & 0 & -i \\ 1 & i & 0 \end{pmatrix} \tag{5.69}
\]

and that the vector

\[
v_0 \equiv e_1 + i e_2 = \begin{pmatrix} 1 \\ i \\ 0 \end{pmatrix}
\]

is a highest weight vector with $j = 1$. This, along with the fact that $\dim \mathbb{C}^3 = 3$,
allows us to conclude that $(\pi_\mathbb{C}, \mathbb{R}^3_\mathbb{C}) \simeq (\pi_1, V_1)$. This is why regular
three-dimensional vectors are said to be “spin-one.” Notice that our highest weight vector
$v_0 = e_1 + i e_2$ is just the analog of the function $f_0 = x + iy \in H_1(\mathbb{R}^3)$, which carries an
equivalent representation, and that in terms of SO(3) reps we have

\[
(\Pi_\mathbb{C}, \mathbb{R}^3_\mathbb{C}) \simeq (\Pi_1, H_1(\mathbb{R}^3)).
\]

Recall also that the adjoint representation of su(2) $\simeq$ so(3) is equivalent to the
fundamental representation of so(3). This combined with the above results implies

\[
(\mathrm{ad}_\mathbb{C}, \mathfrak{so}(3)_\mathbb{C}) \simeq (\mathrm{ad}_\mathbb{C}, \mathfrak{su}(2)_\mathbb{C}) \simeq (\pi_1, V_1)
\]

so the adjoint representation of su(2) is “spin-one” as well.
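
The checks requested in Example 5.32 are quick to carry out numerically. The sketch below assumes the standard so(3) basis $(L_i)_{jk} = -\epsilon_{ijk}$ and the definitions $J_z = iL_z$, $J_\pm = iL_x \mp L_y$ from (5.63); the variable names are illustrative.

```python
import numpy as np

# Assumed standard basis of so(3): (L_i)_{jk} = -epsilon_{ijk}.
Lx = np.array([[0, 0, 0], [0, 0, -1], [0, 1, 0]], dtype=complex)
Ly = np.array([[0, 0, 1], [0, 0, 0], [-1, 0, 0]], dtype=complex)
Lz = np.array([[0, -1, 0], [1, 0, 0], [0, 0, 0]], dtype=complex)

Jz = 1j * Lz
Jp = 1j * Lx - Ly
Jm = 1j * Lx + Ly

v0 = np.array([1, 1j, 0])      # e_1 + i e_2

# Lower with J_- to generate the rest of the basis of (5.65) with j = 1.
v1 = Jm @ v0
v2 = Jm @ v1
v3 = Jm @ v2                   # should vanish, ending the chain after 2j + 1 = 3 vectors
```

The vectors `v0`, `v1`, `v2` have $J_z$ eigenvalues $1, 0, -1$ and span $\mathbb{C}^3$, consistent with the identification with $(\pi_1, V_1)$.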


Example 5.33. Symmetric traceless tensors


Consider the space $S^{2'}(\mathbb{R}^3)$ of symmetric traceless 2nd rank tensors defined
in (5.55b). This is an SO(3) invariant subspace of $\mathbb{R}^3 \otimes \mathbb{R}^3$, and so furnishes a
representation $(\Pi^2_0, S^{2'}(\mathbb{R}^3))$ of SO(3). What representation is this? To find out, consider
the complexification of the associated so(3) representation, $((\pi^2_0)_\mathbb{C}, S^{2'}(\mathbb{C}^3))$, where
$S^{2'}(\mathbb{C}^3)$ is also defined by (5.55b), just with $\mathbb{C}^3$ replacing $\mathbb{R}^3$. Then it’s
straightforward to verify that

\[
v_0 \equiv (e_1 + i e_2) \otimes (e_1 + i e_2) \tag{5.70}
\]

is a highest weight vector with $j = 2$, and this, along with the fact that $\dim S^{2'}(\mathbb{C}^3) = 5$
(check!), implies that

\[
((\pi^2_0)_\mathbb{C}, S^{2'}(\mathbb{C}^3)) \simeq (\pi_2, V_2).
\]

This is why symmetric traceless 2nd rank tensors on $\mathbb{R}^3$ are sometimes said to be
“spin-two.”

Recall that $S^{2'}(\mathbb{R}^3)$ was defined in (5.55b) as part of the decomposition

\[
\mathbb{R}^3 \otimes \mathbb{R}^3 = \mathbb{R}g \oplus \Lambda^2(\mathbb{R}^3) \oplus S^{2'}(\mathbb{R}^3)
\]

which has matrix counterpart

\[
M_3(\mathbb{R}) = \mathbb{R}I \oplus A_3(\mathbb{R}) \oplus S'_3(\mathbb{R}).
\]

Complexifying and using the fact (cf. Example 5.22) that $\Lambda^2(\mathbb{R}^3)$ is equivalent
to the adjoint representation and is hence “spin-one,” we obtain the following
decomposition:

\[
V_1 \otimes V_1 \simeq \mathbb{C}^3 \otimes \mathbb{C}^3 \simeq M_3(\mathbb{C}) \simeq V_0 \oplus V_1 \oplus V_2.
\]

This is an instance of (5.56) and should be familiar from angular momentum
addition in quantum mechanics.
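
The claim that (5.70) is a highest weight vector with $j = 2$ can be checked on matrices. The sketch below rests on an assumption that should be compared with the transformation law used elsewhere in the book: a Lie algebra element $A$ is taken to act on the matrix $[T]$ of a rank-2 tensor by $[T] \mapsto A[T] + [T]A^T$ (the derivative of $[T] \mapsto R[T]R^T$), extended complex-linearly. The names `act` and `T0` are illustrative.

```python
import numpy as np

# J_z and J_+ from (5.69), and J_- = i L_x + L_y in the same conventions.
Jz = np.array([[0, -1j, 0], [1j, 0, 0], [0, 0, 0]])
Jp = np.array([[0, 0, -1], [0, 0, -1j], [1, 1j, 0]])
Jm = np.array([[0, 0, 1], [0, 0, -1j], [-1, 1j, 0]])

def act(A, T):
    """Assumed Lie algebra action on a rank-2 tensor matrix: A T + T A^T."""
    return A @ T + T @ A.T

v = np.array([1, 1j, 0])
T0 = np.outer(v, v)            # matrix of (e_1 + i e_2) tensor (e_1 + i e_2)

# Lower repeatedly: a spin-two chain has 2j + 1 = 5 nonzero vectors.
chain = [T0]
for _ in range(5):
    chain.append(act(Jm, chain[-1]))
```

Here `T0` is symmetric and traceless, `act(Jz, T0)` equals `2 * T0`, `act(Jp, T0)` vanishes, and the lowering chain terminates after five nonzero matrices, matching $\dim V_2 = 5$.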

Exercise 5.41. Using the standard basis for $\mathbb{C}^3$, write down the matrix $[v_0]$ for
$v_0 = (e_1 + i e_2) \otimes (e_1 + i e_2)$. Then use the appropriate matrix transformation law to show that $[v_0]$ is
an eigenvector of $i(\pi^2_0)_\mathbb{C}(L_z)$ with eigenvalue 2. Equation (5.69) may come in handy here.

Exercise 5.42. Consider the fundamental representation of SO(2) on $\mathbb{R}^2$, which we know
is irreducible by Exercise 5.35. The complexification of this representation is just given
by the same SO(2) matrices acting on $\mathbb{C}^2$ rather than $\mathbb{R}^2$. Show that this representation is
reducible, by diagonalizing the SO(2) generator

\[
X = \begin{pmatrix} 0 & -1 \\ 1 & 0 \end{pmatrix}.
\]

Note that this diagonalization can now be done because both complex eigenvalues and
complex basis transformations are allowed, in contrast to the real case.
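
The contrast between the real and complex cases is visible in a few lines of NumPy. This is only a numerical sketch of the diagonalization, with illustrative variable names; it is not a substitute for working the exercise by hand.

```python
import numpy as np

X = np.array([[0.0, -1.0], [1.0, 0.0]])

# Over C the generator has two distinct eigenvalues, +i and -i.
eigenvalues, eigenvectors = np.linalg.eig(X)

# Changing to the (complex) eigenvector basis diagonalizes X.
D = np.linalg.inv(eigenvectors) @ X @ eigenvectors

# Neither eigenvalue is real, so no real eigenvector (invariant line in R^2) exists.
has_real_eigenvalue = bool(np.any(np.isclose(eigenvalues.imag, 0)))
```

Each complex eigenvector spans a one-dimensional invariant subspace of $\mathbb{C}^2$, which is the reducibility the exercise asks for, while `has_real_eigenvalue` being false reflects the irreducibility over $\mathbb{R}$.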


5.11 The Irreducible Representations of sl(2,C)_R, SL(2,C),
and SO(3,1)_o


In this section we’ll use the techniques and results of the previous section to classify
all the finite-dimensional complex irreps of $\mathfrak{sl}(2,\mathbb{C})_\mathbb{R} \simeq \mathfrak{so}(3,1)$, and then use this
to find the irreps of the associated groups SL(2,C) and SO(3,1)$_o$.

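Let $(\pi, V)$ be a finite-dimensional complex irreducible representation of
$\mathfrak{sl}(2,\mathbb{C})_\mathbb{R}$. Define the operators

\[
\begin{aligned}
M_i &\equiv \tfrac{1}{2}\left(\pi(S_i) - i\pi(K_i)\right), && i = 1,2,3 \\
N_i &\equiv \tfrac{1}{2}\left(\pi(S_i) + i\pi(K_i)\right), && i = 1,2,3
\end{aligned}
\tag{5.71}
\]

where $\{S_i, K_i\}_{i=1,2,3}$ is our usual basis for $\mathfrak{sl}(2,\mathbb{C})_\mathbb{R}$. One can check that the $M_i$ and
$N_i$ commute with each other, as well as satisfy the su(2) commutation relations
internally, i.e.

\[
\begin{aligned}
[M_i, N_j] &= 0 \\
[M_i, M_j] &= \sum_{k=1}^{3} \epsilon_{ijk} M_k \\
[N_i, N_j] &= \sum_{k=1}^{3} \epsilon_{ijk} N_k.
\end{aligned}
\tag{5.72}
\]

We have thus taken the complex span of the set $\{\pi(K_i), \pi(S_i)\}$ [notice the factors
of $i$ in (5.71)] and found a new basis for this Lie algebra of operators that makes it
look like two commuting copies of su(2). We can thus define the usual raising and
lowering operators

\[
\begin{aligned}
N_\pm &\equiv iN_1 \mp N_2 \\
M_\pm &\equiv iM_1 \mp M_2
\end{aligned}
\]

which then have the usual commutation relations between themselves and $iM_z$, $iN_z$
(where $M_z \equiv M_3$ and $N_z \equiv N_3$):

\[
\begin{aligned}
[iM_z, M_\pm] &= \pm M_\pm \\
[iN_z, N_\pm] &= \pm N_\pm \\
[M_+, M_-] &= 2iM_z \\
[N_+, N_-] &= 2iN_z.
\end{aligned}
\]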


With this machinery set up we can now use the strategy from the last section. First,
pick a vector $v \in V$ that is an eigenvector of both $iM_z$ and $iN_z$ (that such a vector
exists is guaranteed by Problem 5-11). Then by applying $M_+$ and $N_+$ we can raise
the $iM_z$ and $iN_z$ eigenvalues until the raising operators give us the zero vector; let
$v_{0,0}$ denote the vector with the highest eigenvalues (which we’ll again refer to as a
“highest weight vector”), and let $(j_1, j_2)$ denote those eigenvalues under $iM_z$ and
$iN_z$ respectively, so that

\[
\begin{aligned}
M_+(v_{0,0}) &= 0 \\
N_+(v_{0,0}) &= 0 \\
iM_z(v_{0,0}) &= j_1 v_{0,0} \\
iN_z(v_{0,0}) &= j_2 v_{0,0}.
\end{aligned}
\tag{5.73}
\]

We can then lower the eigenvalues with $N_-$ and $M_-$ to get vectors

\[
v_{k_1,k_2} \equiv (M_-)^{k_1} (N_-)^{k_2}\, v_{0,0}
\]

which are eigenvectors of $iM_z$ and $iN_z$ with eigenvalues $j_1 - k_1$ and $j_2 - k_2$
respectively. By finite-dimensionality of $V$ this chain of vectors must eventually
end, though, so there exist nonnegative integers $l_1, l_2$ such that $v_{l_1,l_2} \neq 0$ but

\[
M_-(v_{l_1,l_2}) = N_-(v_{l_1,l_2}) = 0.
\]


Calculations identical to those from the su(2) case show that $l_i = 2j_i$, and that the
action of the operators $M_i$, $N_i$ is given by

\[
\begin{aligned}
M_+(v_{0,0}) &= 0 \\
N_+(v_{0,0}) &= 0 \\
M_-(v_{k_1,k_2}) &= v_{k_1+1,k_2}, && k_1 < 2j_1 \\
N_-(v_{k_1,k_2}) &= v_{k_1,k_2+1}, && k_2 < 2j_2 \\
iM_z(v_{k_1,k_2}) &= (j_1 - k_1)\, v_{k_1,k_2} \\
iN_z(v_{k_1,k_2}) &= (j_2 - k_2)\, v_{k_1,k_2} \\
M_-(v_{2j_1,k_2}) &= 0 && \forall\, k_2 \\
N_-(v_{k_1,2j_2}) &= 0 && \forall\, k_1 \\
M_+(v_{k_1,k_2}) &= [2j_1k_1 - k_1(k_1 - 1)]\, v_{k_1-1,k_2}, && k_1 \neq 0 \\
N_+(v_{k_1,k_2}) &= [2j_2k_2 - k_2(k_2 - 1)]\, v_{k_1,k_2-1}, && k_2 \neq 0
\end{aligned}
\tag{5.74}
\]

(this is just two copies of (5.65), one each for the $M_i$ and the $N_i$). As in the
su(2) case, we note that the $\{v_{k_1,k_2}\}$ are linearly independent and span an invariant
subspace of $V$, hence must span all of $V$ since we assumed $V$ was irreducible. Thus,
we conclude that any complex finite-dimensional irrep of $\mathfrak{sl}(2,\mathbb{C})_\mathbb{R}$ is of the form
$(\pi_{(j_1,j_2)}, V_{(j_1,j_2)})$ where $V_{(j_1,j_2)}$ has a basis

\[
B = \{v_{k_1,k_2} \mid 0 \le k_1 \le 2j_1,\ 0 \le k_2 \le 2j_2\}
\]

and the operators $\pi_{(j_1,j_2)}(S_i)$, $\pi_{(j_1,j_2)}(K_i)$ satisfy (5.74), with $2j_1, 2j_2 \in \mathbb{N}$. This
tells us that

\[
\dim V_{(j_1,j_2)} = (2j_1 + 1)(2j_2 + 1).
\]


Let’s abbreviate the representations $(\pi_{(j_1,j_2)}, V_{(j_1,j_2)})$ as simply $(j_1, j_2)$, as is done
in the physics literature. As in the su(2) case, we can show that (5.74) actually
defines a representation of $\mathfrak{sl}(2,\mathbb{C})_\mathbb{R}$, and using the same arguments that we did in
the su(2) case we conclude that


Proposition 5.8. The representations $(j_1, j_2)$, $2j_1, 2j_2 \in \mathbb{N}$ are, up to equivalence,
all the complex finite-dimensional irreducible representations of $\mathfrak{sl}(2,\mathbb{C})_\mathbb{R}$.


As in the su(2) case, we deduced (5.74) from just the $\mathfrak{sl}(2,\mathbb{C})_\mathbb{R}$ commutation
relations, the finite-dimensionality of $V$, and the existence of a highest weight
vector $v_{0,0}$ satisfying (5.73). Thus if we’re given a finite-dimensional $\mathfrak{sl}(2,\mathbb{C})_\mathbb{R}$
representation and can find a highest weight vector $v_{0,0}$ for some $(j_1, j_2)$, we can
conclude that the representation space contains an invariant subspace equivalent to
$(j_1, j_2)$.
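
One concrete way to realize the relations (5.74) is to let the $M_i$ act on the first factor and the $N_i$ on the second factor of a tensor product of two su(2) irreps. The sketch below does this numerically. It assumes the su(2) matrices are built from (5.65) as before, inverts (5.71) to get $\pi(S_i) = M_i + N_i$ and $\pi(K_i) = i(M_i - N_i)$, and checks the bracket relations $[S_1, S_2] = S_3$, $[S_1, K_2] = K_3$, $[K_1, K_2] = -S_3$, which are the ones satisfied by $K_i = iS_i$ in the fundamental; all function names are illustrative.

```python
import numpy as np

def su2_generators(two_j):
    """pi_j(S_1), pi_j(S_2), pi_j(S_3) built from (5.65) and (5.63)."""
    j, n = two_j / 2, two_j + 1
    Jz = np.diag([j - k for k in range(n)]).astype(complex)
    Jp = np.zeros((n, n), dtype=complex)
    Jm = np.zeros((n, n), dtype=complex)
    for k in range(1, n):
        Jm[k, k - 1] = 1
        Jp[k - 1, k] = 2 * j * k - k * (k - 1)
    return [-0.5j * (Jp + Jm), 0.5 * (Jm - Jp), -1j * Jz]

def comm(A, B):
    return A @ B - B @ A

def irrep(two_j1, two_j2):
    """Operators M_i, N_i, pi(S_i), pi(K_i) on a space of dimension (2j1+1)(2j2+1)."""
    A, B = su2_generators(two_j1), su2_generators(two_j2)
    Ia, Ib = np.eye(two_j1 + 1), np.eye(two_j2 + 1)
    M = [np.kron(a, Ib) for a in A]          # first copy of su(2)
    N = [np.kron(Ia, b) for b in B]          # second, commuting copy
    S = [m + n for m, n in zip(M, N)]        # pi(S_i) = M_i + N_i
    K = [1j * (m - n) for m, n in zip(M, N)] # pi(K_i) = i (M_i - N_i)
    return M, N, S, K

M, N, S, K = irrep(1, 2)                     # the (1/2, 1) representation
```

For `irrep(1, 2)` the matrices are $6 \times 6$, in line with $\dim V_{(j_1,j_2)} = (2j_1+1)(2j_2+1)$.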

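Example 5.34. $(\pi, \mathbb{C}^2)$ The fundamental (left-handed spinor) representation


Consider the left-handed spinor representation $(\pi, \mathbb{C}^2)$ of $\mathfrak{sl}(2,\mathbb{C})_\mathbb{R}$, which is also
just the fundamental of $\mathfrak{sl}(2,\mathbb{C})_\mathbb{R}$, i.e.

\[
\begin{aligned}
\pi(S_1) &= S_1 = \frac{1}{2}\begin{pmatrix} 0 & -i \\ -i & 0 \end{pmatrix} &
\pi(K_1) &= K_1 = \frac{1}{2}\begin{pmatrix} 0 & 1 \\ 1 & 0 \end{pmatrix} \\
\pi(S_2) &= S_2 = \frac{1}{2}\begin{pmatrix} 0 & -1 \\ 1 & 0 \end{pmatrix} &
\pi(K_2) &= K_2 = \frac{1}{2}\begin{pmatrix} 0 & -i \\ i & 0 \end{pmatrix} \\
\pi(S_3) &= S_3 = \frac{1}{2}\begin{pmatrix} -i & 0 \\ 0 & i \end{pmatrix} &
\pi(K_3) &= K_3 = \frac{1}{2}\begin{pmatrix} 1 & 0 \\ 0 & -1 \end{pmatrix}.
\end{aligned}
\]

In this case we have (check!)

\[
M_i = S_i, \qquad N_i = 0
\]

and hence

\[
M_+ = \begin{pmatrix} 0 & 1 \\ 0 & 0 \end{pmatrix}, \qquad
M_- = \begin{pmatrix} 0 & 0 \\ 1 & 0 \end{pmatrix}, \qquad
N_+ = N_- = 0.
\]

With this in hand you can easily check that $(1,0) \in \mathbb{C}^2$ is a highest weight vector
with $j_1 = 1/2$, $j_2 = 0$. Thus the fundamental representation of $\mathfrak{sl}(2,\mathbb{C})_\mathbb{R}$ is just
$(\tfrac{1}{2}, 0)$.
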
Example 5.35. $(S^{2j}\pi, S^{2j}(\mathbb{C}^2))$ Symmetric tensor products of left-handed spinors


As with su(2), we can build other irreps by taking symmetric tensor products.
Consider the $2j$th symmetric tensor product representation $(S^{2j}\pi, S^{2j}(\mathbb{C}^2))$, $2j \in \mathbb{N}$
and the vector

\[
v_{0,0} \equiv \underbrace{e_1 \otimes \cdots \otimes e_1}_{2j \text{ times}} \in S^{2j}(\mathbb{C}^2).
\]

Using (5.25) it’s straightforward to check, as you did in Example 5.30, that this is
a highest weight vector with eigenvalue $(j, 0)$, and so we conclude that $S^{2j}(\mathbb{C}^2)$
contains an invariant subspace equivalent to $(j, 0)$. Noting that

\[
\dim S^{2j}(\mathbb{C}^2) = \dim\,(j, 0) = 2j + 1
\]

we conclude that $(S^{2j}\pi, S^{2j}(\mathbb{C}^2)) \simeq (j, 0)$.

Example 5.36. (π̄, C^2) The right-handed spinor representation


Consider the right-handed spinor representation (Π̄, C^2) from Example 5.6. A quick
calculation (do it!) reveals that the induced Lie algebra representation (π̄, C^2) is
given by

    \bar{\pi}(X) = -X^\dagger, \qquad X \in sl(2,\mathbb{C})_{\mathbb{R}}.

In particular, then, we have π̄(S_i) = S_i since the S_i are anti-Hermitian, as well as

    \bar{\pi}(\tilde{K}_i) = \bar{\pi}(iS_i) = -i\,\bar{\pi}(S_i) = -iS_i.


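We then have

    M_i = 0, \qquad N_i = S_i

and hence

    N_+ = \begin{pmatrix} 0 & 1 \\ 0 & 0 \end{pmatrix}, \qquad
    N_- = \begin{pmatrix} 0 & 0 \\ 1 & 0 \end{pmatrix}, \qquad
    M_+ = M_- = 0.

You can again check that (1,0) ∈ C^2 is a highest weight vector, but this time with
j_1 = 0, j_2 = 1/2, and so the right-handed spinor representation of sl(2,C)_R is just
(0, 1/2).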
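
The two spinor examples above can be checked numerically. The following is a minimal sketch, not the book's own computation: it assumes the definitions M_i = (π(S_i) − iπ(K̃_i))/2, N_i = (π(S_i) + iπ(K̃_i))/2, M_+ = iM_1 − M_2 and N_+ = iN_1 − N_2 from earlier in this section, and the helper names (`scale`, `add`, `dagger`, `apply`, `highest_weight`) are purely illustrative.

```python
# Check of Examples 5.34 and 5.36 with plain Python lists (no external libraries).

def scale(c, A):
    """Multiply every entry of the matrix A by the scalar c."""
    return [[c * a for a in row] for row in A]

def add(A, B):
    """Entrywise sum of two matrices of the same shape."""
    return [[a + b for a, b in zip(ra, rb)] for ra, rb in zip(A, B)]

def dagger(A):
    """Conjugate transpose of a square matrix."""
    n = len(A)
    return [[complex(A[j][i]).conjugate() for j in range(n)] for i in range(n)]

def apply(A, v):
    """Matrix-vector product."""
    return [sum(a * x for a, x in zip(row, v)) for row in A]

# Basis of sl(2,C)_R used in Example 5.34: S_i and K~_i = i S_i.
S = [
    [[0, -0.5j], [-0.5j, 0]],
    [[0, -0.5], [0.5, 0]],
    [[-0.5j, 0], [0, 0.5j]],
]
K = [scale(1j, s) for s in S]

def left(X):
    """Fundamental (left-handed) representation: pi(X) = X."""
    return X

def right(X):
    """Right-handed representation: pi-bar(X) = -X^dagger."""
    return scale(-1, dagger(X))

def highest_weight(pi, v):
    """Return (j1, j2) if v is a highest weight vector of pi, else None.

    ASSUMED definitions (taken from earlier in the section, not shown here):
    M_i = (pi(S_i) - i pi(K~_i))/2, N_i = (pi(S_i) + i pi(K~_i))/2,
    M_+ = i M_1 - M_2, N_+ = i N_1 - N_2.
    """
    M = [scale(0.5, add(pi(s), scale(-1j, pi(k)))) for s, k in zip(S, K)]
    N = [scale(0.5, add(pi(s), scale(1j, pi(k)))) for s, k in zip(S, K)]
    M_plus = add(scale(1j, M[0]), scale(-1, M[1]))
    N_plus = add(scale(1j, N[0]), scale(-1, N[1]))
    if any(abs(c) > 1e-12 for c in apply(M_plus, v) + apply(N_plus, v)):
        return None  # not annihilated by both raising operators
    weights = []
    for op in (scale(1j, M[2]), scale(1j, N[2])):
        w = apply(op, v)
        lam = w[0] / v[0]  # candidate eigenvalue (assumes v[0] != 0)
        if any(abs(wi - lam * vi) > 1e-12 for wi, vi in zip(w, v)):
            return None  # not an eigenvector of i M_z or i N_z
        weights.append(lam.real)
    return tuple(weights)

print(highest_weight(left, [1, 0]))   # left-handed spinor
print(highest_weight(right, [1, 0]))  # right-handed spinor
```

Under these assumptions the first call should report the weights (1/2, 0) and the second (0, 1/2), matching the two examples.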
Example 5.37. (S^{2j}π̄, S^{2j}(C^2)) Symmetric tensor products of right-handed spinors


As before, we can build other irreps by taking symmetric tensor products. Again,
consider the 2jth symmetric tensor product representation (S^{2j}π̄, S^{2j}(C^2)), 2j ∈ N,
and the vector

    \bar{v}_{0,0} \equiv \underbrace{e_1 \otimes \cdots \otimes e_1}_{2j \text{ times}} \in S^{2j}(\mathbb{C}^2).

Again, it's straightforward to check that this is a highest weight vector with
eigenvalues (0, j), and we can conclude as before that (S^{2j}π̄, S^{2j}(C^2)) is
equivalent to (0, j).


So far we have used symmetric tensor products of the left-handed and right-handed
spinor representations to build the (j, 0) and (0, j) irreps. From here, getting
the general irrep (j_1, j_2) is easy; we just take the tensor product of (j_1, 0) and
(0, j_2)! To see this, let v_{0,0} ∈ (j, 0) and v̄_{0,0} ∈ (0, k) be highest weight vectors.
Then it's straightforward to check that

    v_{0,0} \otimes \bar{v}_{0,0} \in (j, 0) \otimes (0, k)

is a highest weight vector with j_1 = j, j_2 = k, and so we conclude that (j, 0) ⊗
(0, k) contains an invariant subspace equivalent to (j, k). However, since

    \dim[(j, 0) \otimes (0, k)] = \dim (j, k) = (2j + 1)(2k + 1)

we conclude that these representations are equivalent, and so in general (switching
notation a little),

    (j_1, j_2) \simeq (j_1, 0) \otimes (0, j_2).

Thus, all the irreps of sl(2,C)_R can be built out of the left-handed spinor
(fundamental) representation, the right-handed spinor representation, and
various tensor products of the two.

As before, this classification of the complex finite-dimensional irreps of sl(2,C)_R
also yields, with minimal effort, the classification of the complex finite-dimensional
irreps of SL(2,C). Any complex finite-dimensional irrep of SL(2,C) yields a
complex finite-dimensional irrep of sl(2,C)_R, which must be equivalent to (j_1, j_2)
for some j_1, j_2. Since

    (j_1, j_2) \simeq \left( S^{2j_1}\pi \otimes S^{2j_2}\bar{\pi},\; S^{2j_1}(\mathbb{C}^2) \otimes S^{2j_2}(\mathbb{C}^2) \right)

we then conclude that our original SL(2,C) irrep is equivalent to
(S^{2j_1}Π ⊗ S^{2j_2}Π̄, S^{2j_1}(C^2) ⊗ S^{2j_2}(C^2)) for some j_1, j_2. Thus, the representations

    \left( S^{2j_1}\Pi \otimes S^{2j_2}\bar{\Pi},\; S^{2j_1}(\mathbb{C}^2) \otimes S^{2j_2}(\mathbb{C}^2) \right), \qquad 2j_1, 2j_2 \in \mathbb{N}

are (up to equivalence) all the complex finite-dimensional irreducible
representations of SL(2,C).

How about representations of SO(3,1)_o? We saw that in the case of SO(3), not
all representations of the associated Lie algebra actually arise from representations
of the group, and the same is true here. Say we have a complex finite-dimensional
irrep (Π, V) of SO(3,1)_o, and consider its induced Lie algebra representation
(π, V), which must be equivalent to (j_1, j_2) for some j_1, j_2. Noting that

    iM_z + iN_z = i\pi(L_z)

we have (again, be sure to distinguish π the number from π the representation!)

    e^{i \cdot 2\pi (iM_z + iN_z)}\, v_{0,0} = e^{i \cdot 2\pi (j_1 + j_2)}\, v_{0,0}

as well as

    e^{i \cdot 2\pi (iM_z + iN_z)}\, v_{0,0} = e^{-2\pi \cdot \pi(L_z)}\, v_{0,0}
                                            = \Pi(e^{-2\pi L_z})\, v_{0,0}
                                            = \Pi(I)\, v_{0,0}
                                            = v_{0,0}

so we conclude that

    e^{i \cdot 2\pi (j_1 + j_2)} = 1 \iff j_1 + j_2 \in \mathbb{N},    (5.75)

and thus only representations (j_1, j_2) satisfying this condition can arise from
SO(3,1)_o representations. (It's also true that for any j_1, j_2 satisfying this condition,
there exists an SO(3,1)_o representation with induced Lie algebra representation
(j_1, j_2), though we won't prove that here.)
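
As a quick illustration of condition (5.75), here is a small sketch using exact rational arithmetic. The function names `arises_from_so31` and `rotation_phase` are hypothetical, chosen only for this example, and N is taken to include 0.

```python
from fractions import Fraction
import cmath

def arises_from_so31(j1, j2):
    """Return True if (j1, j2) satisfies condition (5.75), i.e. j1 + j2 is an integer.

    j1 and j2 must be non-negative half-integers (2*j in N).
    """
    j1, j2 = Fraction(j1), Fraction(j2)
    for j in (j1, j2):
        if j < 0 or (2 * j).denominator != 1:
            raise ValueError("2*j1 and 2*j2 must be non-negative integers")
    return (j1 + j2).denominator == 1

def rotation_phase(j1, j2):
    """Phase e^{i 2 pi (j1 + j2)} picked up by the highest weight vector v_{0,0}."""
    return cmath.exp(2j * cmath.pi * float(Fraction(j1) + Fraction(j2)))

half = Fraction(1, 2)
for rep in [(0, 0), (half, 0), (half, half), (1, 0), (1, half)]:
    print(rep, arises_from_so31(*rep), rotation_phase(*rep))
```

The representations with a phase of 1 (up to rounding) are exactly those flagged `True`; the spinor (1/2, 0), for instance, picks up a phase of −1 and so cannot come from SO(3,1)_o.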


Example 5.38. R^4 The four-vector representation of SO(3,1)_o


The fundamental representation (Π, R^4) is the most familiar SO(3,1)_o
representation, corresponding to four-dimensional vectors in Minkowski space. What
(j_1, j_2) does it correspond to? To find out, we first complexify the representation
to (Π_C, C^4) and then consider the induced sl(2,C)_R representation (π_C, C^4).
Straightforward calculations show that

    iM_z = \frac{1}{2}\begin{pmatrix} 0 & -i & 0 & 0 \\ i & 0 & 0 & 0 \\ 0 & 0 & 0 & 1 \\ 0 & 0 & 1 & 0 \end{pmatrix}, \qquad
    iN_z = \frac{1}{2}\begin{pmatrix} 0 & -i & 0 & 0 \\ i & 0 & 0 & 0 \\ 0 & 0 & 0 & -1 \\ 0 & 0 & -1 & 0 \end{pmatrix}

and this, along with expressions for M_± and N_± that you should derive, can be used
to show that

    v_{0,0} = e_1 + i e_2 = \begin{pmatrix} 1 \\ i \\ 0 \\ 0 \end{pmatrix}


is a highest weight vector with (j_1, j_2) = (1/2, 1/2). Noting that

    \dim \mathbb{C}^4 = 4 = \dim (1/2, 1/2)

we conclude that

    (\pi_{\mathbb{C}}, \mathbb{C}^4) \simeq (1/2, 1/2).

Note that j_1 + j_2 = 1/2 + 1/2 = 1 ∈ N, in accordance with (5.75).
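
The highest weight claim in this example can be verified directly. The sketch below assumes coordinates ordered (x, y, z, t), rotation generators L_i acting on the spatial block, boost generators K_i with entries +1 in the (i, t) and (t, i) slots, and the definitions M_i = (L_i − iK_i)/2, N_i = (L_i + iK_i)/2, M_+ = iM_1 − M_2, N_+ = iN_1 − N_2. These conventions are assumptions on my part and should be compared with those fixed earlier in the text.

```python
# Check that v = e_1 + i e_2 is a highest weight vector with (j1, j2) = (1/2, 1/2).

def unit(i, j, n=4):
    """n x n matrix with a single 1 in row i, column j (0-indexed)."""
    return [[1 if (r, c) == (i, j) else 0 for c in range(n)] for r in range(n)]

def add(A, B):
    return [[a + b for a, b in zip(ra, rb)] for ra, rb in zip(A, B)]

def scale(c, A):
    return [[c * a for a in row] for row in A]

def apply(A, v):
    return [sum(a * x for a, x in zip(row, v)) for row in A]

def is_close(u, w, tol=1e-12):
    return all(abs(a - b) < tol for a, b in zip(u, w))

# ASSUMED so(3,1) generators in coordinates (x, y, z, t).
L = [
    add(unit(2, 1), scale(-1, unit(1, 2))),  # L_x
    add(unit(0, 2), scale(-1, unit(2, 0))),  # L_y
    add(unit(1, 0), scale(-1, unit(0, 1))),  # L_z
]
K = [add(unit(i, 3), unit(3, i)) for i in range(3)]  # boosts along x, y, z

M = [scale(0.5, add(L[i], scale(-1j, K[i]))) for i in range(3)]
N = [scale(0.5, add(L[i], scale(1j, K[i]))) for i in range(3)]
i_Mz, i_Nz = scale(1j, M[2]), scale(1j, N[2])
M_plus = add(scale(1j, M[0]), scale(-1, M[1]))
N_plus = add(scale(1j, N[0]), scale(-1, N[1]))

v = [1, 1j, 0, 0]  # e_1 + i e_2
half_v = [0.5 * x for x in v]
zero = [0, 0, 0, 0]

checks = {
    "i M_z v = v/2": is_close(apply(i_Mz, v), half_v),
    "i N_z v = v/2": is_close(apply(i_Nz, v), half_v),
    "M_+ v = 0": is_close(apply(M_plus, v), zero),
    "N_+ v = 0": is_close(apply(N_plus, v), zero),
}
print(checks)
```

With these conventions `i_Mz` and `i_Nz` coincide with the two matrices displayed in the example, and all four checks should come out true.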


Before moving on to our next example, we need to discuss tensor products of
sl(2,C)_R irreps. The nice thing here is that we can use what we know about the
tensor product of su(2) irreps to compute the decomposition of the tensor product
of sl(2,C)_R irreps. In fact, the following is true:


Proposition 5.9. The decomposition into irreps of the tensor product of two
sl(2,C)_R irreps (j_1, j_2) and (k_1, k_2) is given by

    (j_1, j_2) \otimes (k_1, k_2) = \bigoplus (l_1, l_2) \quad \text{where} \quad
    |j_1 - k_1| \le l_1 \le j_1 + k_1, \quad |j_2 - k_2| \le l_2 \le j_2 + k_2

and each (l_1, l_2) consistent with the above inequalities occurs exactly once in the
direct sum decomposition.


Notice the restrictions on l_1 and l_2, which correspond to the decomposition of
tensor products of su(2) representations. We relegate a proof of this formula to
the appendix, but it should seem plausible. We've seen that one can roughly think
of an sl(2,C)_R representation as a "product" of two su(2) representations, and so
the tensor product of two sl(2,C)_R representations (which can both be "factored"
into su(2) representations) should just be given by the various "products" of su(2)
representations that occur when taking the tensor product of the factors. We'll apply
this formula and make this concrete in the next example.
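
The decomposition rule of Proposition 5.9 is easy to mechanize. The sketch below assumes, as in the su(2) case, that l_1 and l_2 run over their allowed ranges in integer steps. The function names `su2_range`, `decompose` and `dim` are illustrative only.

```python
from fractions import Fraction

def su2_range(j, k):
    """Spins l with |j - k| <= l <= j + k, in integer steps (the su(2) rule)."""
    j, k = Fraction(j), Fraction(k)
    lo, hi = abs(j - k), j + k
    return [lo + n for n in range(int(hi - lo) + 1)]

def decompose(a, b):
    """Irreps (l1, l2) in the tensor product of sl(2,C)_R irreps a = (j1, j2), b = (k1, k2)."""
    (j1, j2), (k1, k2) = a, b
    return [(l1, l2) for l1 in su2_range(j1, k1) for l2 in su2_range(j2, k2)]

def dim(rep):
    """Dimension (2*j1 + 1)(2*j2 + 1) of the irrep (j1, j2)."""
    j1, j2 = rep
    return int((2 * Fraction(j1) + 1) * (2 * Fraction(j2) + 1))

half = Fraction(1, 2)
vector = (half, half)
parts = decompose(vector, vector)
print(parts)
# Dimension check: both numbers should agree.
print(dim(vector) ** 2, sum(dim(p) for p in parts))
```

For the product of two copies of (1/2, 1/2) this yields the four irreps (0,0), (0,1), (1,0) and (1,1), whose dimensions 1 + 3 + 3 + 9 add up to 16 as they must.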


Example 5.39. Λ^2 R^4 The antisymmetric tensor representation of SO(3,1)_o


This is an important example since the electromagnetic field tensor F^{μν} lives in
this representation. To classify this representation, we first note that Λ^2 R^4 occurs in
the O(3,1)-invariant decomposition

    \mathbb{R}^4 \otimes \mathbb{R}^4 = \mathbb{R}\eta^{-1} \oplus \Lambda^2(\mathbb{R}^4) \oplus S^{2\prime}(\mathbb{R}^4),    (5.77)

where η^{-1} = η^{μν} e_μ ⊗ e_ν is the inverse of the Minkowski metric and

    S^{2\prime}(\mathbb{R}^4) = \{ T \in S^2(\mathbb{R}^4) \mid \eta_{\mu\nu} T^{\mu\nu} = 0 \}    (5.78)

is the set of symmetric "traceless" 2nd rank tensors, where the trace is effected
by the Minkowski metric η. Note that this is just the O(3,1) analog of the O(n)
decomposition in (5.54). You should check that each of the subspaces in (5.77)
really is O(3,1) invariant. Complexifying this yields

    \mathbb{C}^4 \otimes \mathbb{C}^4 = \mathbb{C}\eta^{-1} \oplus \Lambda^2(\mathbb{C}^4) \oplus S^{2\prime}(\mathbb{C}^4).

Now, we can also decompose C^4 ⊗ C^4 using Proposition 5.9, which yields

    (1/2, 1/2) \otimes (1/2, 1/2) = (0,0) \oplus (1,0) \oplus (0,1) \oplus (1,1).    (5.79)

Now, clearly Cη^{-1} corresponds to (0, 0) since the former is a one-dimensional
representation and (0, 0) is the only one-dimensional irrep in the decomposition (5.79).
What about S^{2'}(C^4)? Well, it's straightforward to check using the results of the
previous example that

    v_{0,0} = (e_1 + i e_2) \otimes (e_1 + i e_2) \in S^{2\prime}(\mathbb{C}^4)

is a highest weight vector with (j_1, j_2) = (1, 1) [it's also instructive to verify that
v_{0,0} actually satisfies the condition in (5.78)]. Checking dimensions then tells us that
S^{2'}(C^4) ≃ (1, 1), so we conclude that

    \Lambda^2(\mathbb{C}^4) \simeq (1,0) \oplus (0,1).

This representation is decomposable, but remember that this does not imply that
Λ^2(R^4) is decomposable! In fact, Λ^2(R^4) is irreducible. This is an unavoidable
subtlety of the relationship between complex representations and real
representations.17 For an interpretation of the representations (1, 0) and (0, 1)
individually, see Problem 5-13.


5.12 Irreducibility and the Representations of O(3, 1) 
and Its Double Covers 


In this section we'll examine the constraints that parity and time-reversal place on
representations of O(3,1) and its double covers. In particular, we will clarify our
discussion of the Dirac spinor from Example 5.26 and explain why such
so(3,1)-decomposable representations seem to occur so naturally.

To start, consider the adjoint representation (Ad, so(3,1)) of O(3,1). It is easily
checked that the parity operator acts as

    \mathrm{Ad}_P(L_i) = L_i, \qquad \mathrm{Ad}_P(K_i) = -K_i.


Now say that we have a double-cover of O(3,1), call it H (these certainly exist and
are non-unique; see Sternberg [19] and the comments at the end of this section). H
will have multiple components, just as O(3,1) does, and the component containing
the identity will be isomorphic to SL(2,C).18 Since H is a double-cover, there
exists a two-to-one group homomorphism Φ : H → O(3,1), which induces the
usual Lie algebra isomorphism φ : sl(2,C)_R → so(3,1). Now let P̃ ∈ H cover
P ∈ O(3,1), so that Φ(P̃) = P. Then from the identity

    \phi(\mathrm{Ad}_h(X)) = \mathrm{Ad}_{\Phi(h)}(\phi(X)) \qquad \forall\, h \in H,\; X \in sl(2,\mathbb{C})_{\mathbb{R}}    (5.80)

which you will prove below, we have

    \phi(\mathrm{Ad}_{\tilde{P}}(S_i)) = \mathrm{Ad}_P(L_i) = L_i

as well as

    \phi(\mathrm{Ad}_{\tilde{P}}(\tilde{K}_i)) = -K_i.

Since φ is an isomorphism we conclude that

    \mathrm{Ad}_{\tilde{P}}(S_i) = \tilde{P} S_i \tilde{P}^{-1} = S_i

    \mathrm{Ad}_{\tilde{P}}(\tilde{K}_i) = \tilde{P} \tilde{K}_i \tilde{P}^{-1} = -\tilde{K}_i.


17For the whole story on this relationship, see Onischik [14].

18This should seem plausible, but proving it rigorously would require homotopy theory and would
take us too far afield. See Frankel [6] for a nice discussion of this topic.

If (Π, V) is a representation of H and (π, V) the induced sl(2,C)_R representation,
then this implies that

    \Pi_{\tilde{P}}\, M_i\, \Pi_{\tilde{P}}^{-1} = N_i
    \Pi_{\tilde{P}}\, N_i\, \Pi_{\tilde{P}}^{-1} = M_i.    (5.81)
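
For a concrete feel for (5.81), one can test it in the four-vector representation of O(3,1) itself, where parity and time-reversal act as diagonal matrices. This is only a sketch under assumed conventions: coordinates (x, y, z, t), P = diag(−1, −1, −1, 1), T = diag(1, 1, 1, −1), boost generators K_i with +1 entries, and M_i = (L_i − iK_i)/2, N_i = (L_i + iK_i)/2.

```python
# Conjugation by parity (or time-reversal) swaps M_i and N_i in the vector representation.

def unit(i, j, n=4):
    """n x n matrix with a single 1 in row i, column j (0-indexed)."""
    return [[1 if (r, c) == (i, j) else 0 for c in range(n)] for r in range(n)]

def add(A, B):
    return [[a + b for a, b in zip(ra, rb)] for ra, rb in zip(A, B)]

def scale(c, A):
    return [[c * a for a in row] for row in A]

def matmul(A, B):
    return [[sum(A[i][k] * B[k][j] for k in range(len(B))) for j in range(len(B[0]))]
            for i in range(len(A))]

def diag(entries):
    n = len(entries)
    return [[entries[i] if i == j else 0 for j in range(n)] for i in range(n)]

def conjugate(g, X):
    """g X g^{-1} for a diagonal matrix g with entries +-1 (so g is its own inverse)."""
    return matmul(matmul(g, X), g)

# ASSUMED conventions, coordinates (x, y, z, t).
L = [
    add(unit(2, 1), scale(-1, unit(1, 2))),
    add(unit(0, 2), scale(-1, unit(2, 0))),
    add(unit(1, 0), scale(-1, unit(0, 1))),
]
K = [add(unit(i, 3), unit(3, i)) for i in range(3)]
M = [scale(0.5, add(L[i], scale(-1j, K[i]))) for i in range(3)]
N = [scale(0.5, add(L[i], scale(1j, K[i]))) for i in range(3)]

P = diag([-1, -1, -1, 1])  # parity
T = diag([1, 1, 1, -1])    # time reversal

for name, g in (("P", P), ("T", T)):
    fixes_L = all(conjugate(g, L[i]) == L[i] for i in range(3))
    flips_K = all(conjugate(g, K[i]) == scale(-1, K[i]) for i in range(3))
    swaps = all(conjugate(g, M[i]) == N[i] and conjugate(g, N[i]) == M[i]
                for i in range(3))
    print(name, fixes_L, flips_K, swaps)
```

Both P and T leave the L_i alone and flip the sign of the K_i, so conjugation by either one exchanges M_i and N_i, which is the content of (5.81) in this particular representation.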


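Let's examine some consequences of this. Let W ⊂ V be an irreducible subspace
of (π, V) equivalent to (j_1, j_2), spanned by our usual basis of the form

    B = \{ v_{k_1,k_2} \mid 0 \le k_1 \le 2j_1,\; 0 \le k_2 \le 2j_2 \}.

We then have

    iM_z\, \Pi_{\tilde{P}} v_{0,0} = i\,\Pi_{\tilde{P}} N_z v_{0,0} = j_2\, \Pi_{\tilde{P}} v_{0,0}
    iN_z\, \Pi_{\tilde{P}} v_{0,0} = i\,\Pi_{\tilde{P}} M_z v_{0,0} = j_1\, \Pi_{\tilde{P}} v_{0,0}
    M_+\, \Pi_{\tilde{P}} v_{0,0} = \Pi_{\tilde{P}} N_+ v_{0,0} = 0
    N_+\, \Pi_{\tilde{P}} v_{0,0} = \Pi_{\tilde{P}} M_+ v_{0,0} = 0

and thus Π_P̃ v_{0,0} is a highest weight vector for (j_2, j_1)! We have thus proven the
following proposition: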


Proposition 5.10. Let H be a double-cover of O(3,1) and (Π, V) a complex
representation of H with induced sl(2,C)_R representation (π, V). If W ⊂ V is
an irreducible subspace of (π, V) equivalent to (j_1, j_2) and j_1 ≠ j_2, then there
exists another irreducible subspace W' of (π, V) equivalent to (j_2, j_1).


It should be clear from the above that the operator Π(P̃) corresponding to parity
takes us back and forth between W and W'. This means that even though W is an
invariant subspace of the Lie algebra representation (π, V), W is not invariant under
the Lie group representation (Π, V), since Π(P̃) takes vectors in W to vectors in
W'! If W and W' make up all of V, i.e. if V = W ⊕ W', this means that V is
irreducible under the H representation Π but not under the so(3,1) representation
π, and so we have an irreducible Lie group representation whose induced Lie
algebra representation is not irreducible! We'll meet two examples of this type of
representation below. Note that this does not contradict Proposition 5.4, as the group
H doesn't satisfy the required hypothesis of connectedness.


Exercise 5.43. Let Φ : H → G be a Lie group homomorphism with induced Lie algebra
homomorphism φ : h → g. Use the definition of the adjoint mapping and of φ to show that

    \phi(\mathrm{Ad}_h(X)) = \mathrm{Ad}_{\Phi(h)}(\phi(X)) \qquad \forall\, h \in H,\; X \in \mathfrak{h}.

Exercise 5.44. Verify Eq. (5.81). You will need the result of the previous exercise!
Example 5.40. The Dirac Spinor Revisited 


As a particular application of Proposition 5.10, suppose our representation (Π, V)
of H contains a subspace equivalent to the left-handed spinor (1/2, 0); then, it must
also contain a subspace equivalent to the right-handed spinor (0, 1/2). This is why the
Dirac spinor is (1/2, 0) ⊕ (0, 1/2). It is not irreducible as an SL(2,C) representation,
but it is irreducible as a representation of a group H which extends SL(2,C) and
covers O(3,1).


The Dirac spinor representation is a representation of H, not O(3,1), but many
of the other sl(2,C)_R ≃ so(3,1) representations of interest do come from O(3,1)
representations (for instance, the fundamental representation of so(3,1) and its
various tensor products). Since any O(3,1) representation Π : O(3,1) → GL(V)
yields an H representation Π ∘ Φ : H → GL(V), Proposition 5.10 and the
comments following it hold for O(3,1) representations as well as representations
of H. In the O(3,1) case, though, we can do even better:


Proposition 5.11. Let (Π, V) be a finite-dimensional complex irreducible
representation of O(3,1). Then the induced so(3,1) representation (π, V) is equivalent
to one of the following:

    (j, j),\; 2j \in \mathbb{N} \qquad \text{or} \qquad (j_1, j_2) \oplus (j_2, j_1),\; 2j_1, 2j_2 \in \mathbb{N},\; j_1 + j_2 \in \mathbb{N},\; j_1 \ne j_2.    (5.82)


Proof. Since so(3,1) is semi-simple, (π, V) is completely reducible, i.e. equivalent
to a direct sum of irreducible representations. Let W ⊂ V be one such
representation, equivalent to (j_1, j_2), with highest weight vector v_{0,0}. Then
Π(P)v_{0,0} ≡ v_P is a highest weight vector with eigenvalues (j_2, j_1), and the same
arguments show that Π(T)v_{0,0} ≡ v_T is also a highest weight vector with eigenvalues
(j_2, j_1) (it is not necessarily equal to v_P, though). The same arguments also show that
Π(PT)v_{0,0} ≡ v_{PT} is a highest weight vector with eigenvalues (j_1, j_2) [recall that
T is the time-reversal operator defined in (4.35)]. Now consider the vectors

    w_{0,0} \equiv v_{0,0} + v_{PT}
    u_{0,0} \equiv v_P + v_T.

w_{0,0} is clearly a highest weight vector with eigenvalues (j_1, j_2), and u_{0,0} is clearly
a highest weight vector with eigenvalues (j_2, j_1). Using the fact that P and T
commute, you can easily check that

    \Pi(P) w_{0,0} = \Pi(T) w_{0,0} = u_{0,0}
    \Pi(P) u_{0,0} = \Pi(T) u_{0,0} = w_{0,0}.

We can then define basis vectors

    w_{k_1,k_2} \equiv (M_-)^{k_1} (N_-)^{k_2} w_{0,0}, \qquad 0 \le k_1 \le 2j_1,\; 0 \le k_2 \le 2j_2
    u_{k_1,k_2} \equiv (M_-)^{k_1} (N_-)^{k_2} u_{0,0}, \qquad 0 \le k_1 \le 2j_2,\; 0 \le k_2 \le 2j_1

which span so(3,1)-irreducible subspaces W ≃ (j_1, j_2) and U ≃ (j_2, j_1). By
Proposition 5.4 and the connectedness of SO(3,1)_o, W and U are also irreducible
under SO(3,1)_o, and from (5.75) we know that j_1 + j_2 ∈ N. Furthermore, by the
definition of the w_{k_1,k_2} and u_{k_1,k_2}, as well as (5.81), we have

    \Pi(P) w_{k_1,k_2} = \Pi(T) w_{k_1,k_2} = u_{k_2,k_1}
    \Pi(P) u_{k_1,k_2} = \Pi(T) u_{k_1,k_2} = w_{k_2,k_1}

and so Span{w_{k_1,k_2}, u_{l_1,l_2}} is invariant under O(3,1). We assumed V was irreducible,
though, so we conclude that V = Span{w_{k_1,k_2}, u_{l_1,l_2}}. Does that mean we can
conclude that V ≃ (j_1, j_2) ⊕ (j_2, j_1)? Not quite, because we never established that
w_{0,0} and u_{0,0} were linearly independent! In fact, they might be linearly dependent,
in which case they would be proportional, which would imply that each w_{k_1,k_2} is
proportional to u_{k_1,k_2} (why?), and also that j_1 = j_2 ≡ j. In this case, we obtain

    V = \mathrm{Span}\{ w_{k_1,k_2}, u_{l_1,l_2} \} = \mathrm{Span}\{ w_{k_1,k_2} \} \simeq (j, j)

which is one of the alternatives mentioned in the proposition. If w_{0,0} and u_{0,0} are
linearly independent, however, then so is the set {w_{k_1,k_2}, u_{l_1,l_2}} and so

    V = \mathrm{Span}\{ w_{k_1,k_2}, u_{l_1,l_2} \} = W \oplus U \simeq (j_1, j_2) \oplus (j_2, j_1)

which is the other alternative. All that remains is to show that j_1 ≠ j_2, which you
will do in Exercise 5.45 below. This concludes the proof.


Exercise 5.45. Assume that w_{0,0} and u_{0,0} are linearly independent and that j_1 = j_2.
Use this to construct a nontrivial O(3,1)-invariant subspace of V, contradicting the
irreducibility of V. Thus if V is irreducible and w_{0,0} and u_{0,0} are linearly independent,
then j_1 ≠ j_2 as desired.


Example 5.41. O(3,1) Representations Revisited


In this example we just point out that all the O(3, 1) representations we’ve met have 
the form (5.82). Below is a table of some of these representations, along with their 
complexifications and the corresponding so(3, 1) representations. 


Name                             V             V_C           so(3,1) rep
Scalar (trivial)                 R             C             (0, 0)
Vector (fundamental)             R^4           C^4           (1/2, 1/2)
Antisymmetric tensor (adjoint)   Λ^2 R^4       Λ^2 C^4       (1, 0) ⊕ (0, 1)
Pseudovector                     Λ^3 R^4       Λ^3 C^4       (1/2, 1/2)
Pseudoscalar                     Λ^4 R^4       Λ^4 C^4       (0, 0)
Symmetric traceless tensor       S^{2'}(R^4)   S^{2'}(C^4)   (1, 1)


Note that the pseudovector and pseudoscalar representations yield the same 
so(3,1) representations as the vector and scalar, respectively, as discussed in 
Example 5.18. Thus we have a pair of examples in which two equivalent Lie algebra 
representations come from two non-equivalent matrix Lie group representations! 
Again, this doesn’t contradict Proposition 5.3 since O(3, 1) is not connected. Note 
also that the only representation in the above table that decomposes into more than 
one so(3, 1) irrep is the antisymmetric tensor representation; see Problem 5-13 for 
the action of the parity operator on this representation, and how it takes one back 
and forth between the so(3,1)-irreducible subspaces (1, 0) and (0, 1).
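
As a sanity check on the table, the sketch below compares the dimension of each space V_C with the total dimension of the listed so(3,1) irreps, and tests whether each entry has one of the two forms allowed by (5.82). The dictionary `TABLE` and the function names are illustrative. The dimensions of the exterior powers are computed as binomial coefficients, and that of the symmetric traceless tensors as dim S^2(C^4) − 1.

```python
from fractions import Fraction
from math import comb

def dim_irrep(j1, j2):
    """Dimension (2*j1 + 1)(2*j2 + 1) of the so(3,1) irrep (j1, j2)."""
    return int((2 * Fraction(j1) + 1) * (2 * Fraction(j2) + 1))

def has_form_582(irreps):
    """True if the list of irreps is (j, j) or (j1, j2) + (j2, j1) as in (5.82)."""
    irreps = [(Fraction(a), Fraction(b)) for a, b in irreps]
    if len(irreps) == 1:
        j1, j2 = irreps[0]
        return j1 == j2
    if len(irreps) == 2:
        (j1, j2), (k1, k2) = irreps
        return (k1, k2) == (j2, j1) and j1 != j2 and (j1 + j2).denominator == 1
    return False

half = Fraction(1, 2)
# name -> (dimension of V_C, list of so(3,1) irreps from the table)
TABLE = {
    "scalar": (1, [(0, 0)]),
    "vector": (4, [(half, half)]),
    "antisymmetric tensor": (comb(4, 2), [(1, 0), (0, 1)]),
    "pseudovector": (comb(4, 3), [(half, half)]),
    "pseudoscalar": (comb(4, 4), [(0, 0)]),
    "symmetric traceless tensor": (comb(5, 2) - 1, [(1, 1)]),
}

for name, (dim_V, irreps) in TABLE.items():
    total = sum(dim_irrep(*r) for r in irreps)
    print(f"{name}: dim V_C = {dim_V}, irreps total {total}, form (5.82): {has_form_582(irreps)}")
```

Every row should show matching dimensions and pass the (5.82) test.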


Before concluding this chapter we should talk a little bit about this mysterious
group H which is supposed to be a double-cover of O(3,1). It can be shown19 that
there are exactly eight non-isomorphic double covers of O(3,1) (in contrast to the
case of SO(3) and SO(3,1)_o, which have the unique connected double covers SU(2)
and SL(2,C)). Most of these double covers are somewhat obscure and don't really
crop up in the physics literature, but two of them are quite natural and well studied:
these are the Pin groups Pin(3,1) and Pin(1,3) which appear in the study of Clifford
algebras. Clifford algebras are rich and beautiful objects, and lead naturally to
double covers of all the orthogonal and Lorentz groups. In the four-dimensional
Lorentzian case in particular, one encounters the Dirac gamma matrices and the
Dirac spinor, as well as the Pin groups which act naturally on the Dirac spinor.
For details on the construction of the Pin groups and their properties, see Göckeler
and Schücker [9]. We'll have a little more to say about Dirac gamma matrices in
Sect. 6.3.


19See Sternberg [19].


Chapter 5 Problems


Note: Problems marked with an "*" tend to be longer, and/or more difficult, and/or
more geared towards the completion of proofs and the tying up of loose ends.
Though these problems are still worthwhile, they can be skipped on a first reading.


5-1. Generalize the results of Exercise 5.5 and Example 5.8 by redoing the
calculation for arbitrary G, Π, V, and C(V). That is, let (Π, V) be a
finite-dimensional representation of G, and let Π̃ be the induced representation
on some function space C(V), which will further induce a Lie algebra
representation π̃ on C(V). Choose a basis B = {e_i}_{i=1...n} for V and take
the corresponding vector components v^i as coordinates on V. Show that
with this basis and coordinates and for any X ∈ g, π̃ takes the form of the
differential operator

    \tilde{\pi}(X) = -\sum_{i,j} \pi(X)_{ij}\, v^j \frac{\partial}{\partial v^i}.

By specializing appropriately, reproduce (5.12) and (5.13).


5-2. (*) In this problem we'll develop a coordinate-based proof of our claim
from Example 5.22.

(a) Let X = X^{ij} e_i ⊗ e_j ∈ Λ^2 R^n, and define

    \tilde{X} \equiv X^{ij} L(e_i) \otimes e_j = X^{ij} g_{ik}\, e^k \otimes e_j \in \mathcal{L}(\mathbb{R}^n)
    \phi(X) \equiv [\tilde{X}] \in M_n(\mathbb{R}).

Find an expression for φ(X) in terms of [X] and use it to show that
φ(X) ∈ g.

(b) Prove that Ad(R) ∘ φ = φ ∘ Λ^2Π(R) by evaluating both sides on an
arbitrary X ∈ Λ^2 R^n and showing that the components of the matrices
are equal. You'll need the expansion of X given above, as well as the
coordinate form (5.33) of Λ^2Π. For simplicity of matrix computation,
you may wish to abandon the Einstein Summation Convention here and
write the components of R ∈ G as R_{ij}, even though you're interpreting
R as a linear operator.


5-3. In this problem we'll develop a coordinate-free proof that the fundamental
representation of SL(2,C) is equivalent to its dual. This will also imply
that the fundamental representation of SU(2) is equivalent to its dual as
well.

(a) Consider the epsilon tensor in T^0_2(C^2),

    \epsilon \equiv e^1 \wedge e^2.

Using the definition (3.72) of the determinant, show that ε is SL(2,C)
invariant, i.e. that

    \epsilon(Av, Aw) = \epsilon(v, w) \qquad \forall\, v, w \in \mathbb{C}^2,\; A \in SL(2,\mathbb{C}).

(b) Define a map

    L_\epsilon : \mathbb{C}^2 \to \mathbb{C}^{2*}
    v \mapsto \epsilon(v, \cdot).

Using your result from (a), show that L_ε is an intertwiner, and also show
that L_ε is one-to-one. Conclude that C^2 ≃ C^{2*} as SL(2,C) irreps, and
hence as SU(2) irreps as well.

(c) Given your result from (a), how would you now interpret the matrix A
from (5.47)? Compute [L_ε] in the standard basis and dual basis and
show that [L_ε] = A.

(d) Assuming the very plausible but slightly annoying to prove fact that
(S^{2j}(C^2))* ≃ S^{2j}(C^{2*}), prove that every SU(2) representation is
equivalent to its dual. One might think that this is self-evident since
taking the dual of an irrep doesn't change its dimension and for a given
dimension there is only one SU(2) irrep (up to equivalence), but this
assumes that the dual of an irrep is itself an irrep. This is true, but needs
to be proven, and is the subject of the next problem.

5-4. Show that if (Π, V) is a finite-dimensional irreducible group or Lie
algebra representation, then so is (Π*, V*). Note that this together with
Problem 2-5 also implies the converse, namely that if (Π*, V*) is an irrep
then so is (Π, V). Use this to prove that every su(2) irrep is equivalent to
its dual.


5-5. If G is a matrix Lie group and (Π, V) its fundamental representation, one
can sometimes generate new representations by considering the conjugate
representation

    \bar{\Pi} : G \to GL(V)
    A \mapsto \bar{A},

where Ā denotes the matrix whose entries are just the complex conjugates
of the entries of A.

(a) Verify that Π̄ is a homomorphism, hence a bona fide representation.
Show that for G = SU(n), the conjugate representation is equivalent
to the dual representation. What does this mean in the case of SU(2)?

(b) Let G = SL(2,C). Show that the conjugate to (1/2, 0) is (0, 1/2).
There are a few ways to do this. This justifies the notation Π̄ for the
representation homomorphism of (0, 1/2).


5-6. (*) In this problem we'll show that any complex matrix of the block
diagonal form \begin{pmatrix} B & 0 \\ 0 & \bar{B} \end{pmatrix} is related to a purely real
matrix by a similarity transformation. This then implies the existence of the
Majorana spinor, as mentioned in Example 5.26.

(a) Consider C^n along with its standard basis B = {e_i}_{i=1...n} and linear
operators M_n(C). Let us decompose v ∈ C^n as well as D ∈
M_n(C) into their real and imaginary parts, i.e. v = x + iy and
D = E + iF where x, y ∈ R^n and E, F ∈ M_n(R). If we now
consider V as a 2n-dimensional real vector space V_R with basis B_R =
{e_1, e_2, ..., e_n, ie_1, ie_2, ..., ie_n}, show that D is given (in n × n block
form) by

    [D]_{B_R} = \begin{pmatrix} E & -F \\ F & E \end{pmatrix}.    (5.83)

(b) Rewrite our basis vectors as B_R = {f_i ≡ e_i, f_{n+i} ≡ ie_i}_{i=1...n}. Now
take a deep breath and complexify this space, to get a new complex
vector space (V_R)_C with complex dimension 2n (twice that of our
original space V). Now change bases in (V_R)_C from B_R = {f_j}_{j=1...2n}
to B' = {f_i + i f_{n+i}, f_i − i f_{n+i}}_{i=1...n}. By making the appropriate
(complex) similarity transformation, show that in this basis

    [D]_{B'} = \begin{pmatrix} E + iF & 0 \\ 0 & E - iF \end{pmatrix}.    (5.84)

This implies that any complex matrix of the form (5.84) is related to a
real matrix by a similarity transformation.

(c) If you are curious about the explicit form of the Majorana
representation, compute the Dirac representation of sl(2,C)_R from that of
SL(2,C). Then use the similarity transformation from Example 5.21
to obtain matrices of the form (5.84) for all your sl(2,C)_R generators.
Rewriting these as real matrices as per (5.83) gives the Majorana
representation of sl(2,C)_R.

5-7. Let (π, V) be a representation of a Lie algebra g and assume that π :
g → gl(V) is one-to-one (such Lie algebra representations are said to be
faithful). Show that there is an invariant subspace of (π^1_1, V ⊗ V*) that is
equivalent to the adjoint representation (ad, g).


5-8. In this problem we'll meet a representation that is not completely
reducible.

(a) Consider the representation (Π, R^2) of R given by

    \Pi : \mathbb{R} \to GL(2, \mathbb{R})

Verify that (Π, R^2) is a representation.

(b) If Π was completely reducible then we'd be able to decompose R^2 as
R^2 = V ⊕ W where V and W are one-dimensional. Show that such a
decomposition is impossible.


5-9. In this problem we'll show that Z_2 is semi-simple, i.e. that every
finite-dimensional representation of Z_2 is completely reducible. Our strategy will
be to construct a Z_2-invariant inner product on our vector space, and then
show that for any invariant subspace W, its orthogonal complement W^⊥
is also invariant. We can then iterate this procedure to obtain a complete
decomposition.

(a) Let (Π, V) be a representation of Z_2. We first construct a Z_2-invariant
inner product. To do this, we start with an arbitrary inner product (·|·)_0
on V (which could be defined, for instance, as one for which some
arbitrary set of basis vectors is orthonormal). We then define a "group
averaged" inner product as

    (v|w) \equiv \sum_{h \in \mathbb{Z}_2} (\Pi_h v \,|\, \Pi_h w)_0.

Show that (·|·) is Z_2-invariant, i.e. that

    (\Pi_g v \,|\, \Pi_g w) = (v|w) \qquad \forall\, v, w \in V,\; g \in \mathbb{Z}_2.

(b) Assume now that there exists a nontrivial invariant subspace W ⊂ V
(if no such W existed, then V would be irreducible, hence completely
reducible and we would be done). Define its orthogonal complement

    W^\perp \equiv \{ v \in V \mid (v|w) = 0 \;\; \forall\, w \in W \}.

Argue that there exists an orthonormal basis

    B = \{ w_1, \ldots, w_k, v_{k+1}, \ldots, v_n \} \quad \text{where } w_i \in W,\; i = 1, \ldots, k,

and conclude that V = W ⊕ W^⊥.

(c) Show that W^⊥ is an invariant subspace, so that V = W ⊕ W^⊥ is
a decomposition into invariant subspaces. We still don't know that
W is irreducible, though. Argue that we can nonetheless iterate our
above argument until we obtain a decomposition into irreducibles, thus
proving that (Π, V) is completely reducible.


5-10. (*) In this problem we'll sketch a proof that the space of harmonic
polynomials H_l(R^3) has dimension 2l + 1.

(a) Recall our notation P_k(R^3) for the space of kth degree polynomials on
R^3. Assume that the map

    \Delta : P_k(\mathbb{R}^3) \to P_{k-2}(\mathbb{R}^3)    (5.85)

is onto. Use the rank-nullity theorem to show that dim H_l(R^3) = 2l + 1.
If you need help you might consult Example 3.24, as well as (3.89).

(b) To complete the proof we need to show that (5.85) is onto. I have
not seen a clean or particularly enlightening proof of this fact, so I
don't heartily recommend this part of the problem, but if you really
want to show it you might try showing (inductively on k) that the
two-dimensional Laplacian

    \Delta : P_k(\mathbb{R}^2) \to P_{k-2}(\mathbb{R}^2)

is onto, and then use this to show (inductively on k) that (5.85) is onto
as well.


5-11. Let T and U be commuting linear operators on a complex vector space
V, so that [T, U] = 0. We will show that T and U have a simultaneous
eigenvector.
Using the standard argument we employed in the proof of Proposition 5.5,
show that T has at least one eigenvector. Denote that vector by v_a and
its eigenvalue by a. Let V_a denote the span of all eigenvectors of T with
eigenvalue a; V_a may just be the one-dimensional subspace spanned by v_a,
or it may be bigger if there are other eigenvectors that also have eigenvalue
a. Use the fact that U commutes with T to show that V_a is invariant under
U (i.e., U(v) ∈ V_a whenever v ∈ V_a). We can then restrict U to V_a to get a
linear operator

    U|_{V_a} \equiv U_a \in \mathcal{L}(V_a).

Then use the standard argument again to show that U_a has an eigenvector
v_0 ∈ V_a. Show that this is a simultaneous eigenvector of T and U.


5-12. In this problem we'd like to show that the tensor product "distributes"
over direct sums, in the sense that for any vector spaces V, W, and Z, there
exists a vector space isomorphism

    \phi : (V \oplus W) \otimes Z \to (V \otimes Z) \oplus (W \otimes Z).

Define such a map on decomposable elements by

    \phi : (v, w) \otimes z \mapsto (v \otimes z,\; w \otimes z)

and extend linearly. Show that φ is linear (you'll have to be careful about
how addition works in the direct sum spaces and the tensor product spaces)
as well as one-to-one and onto, so that it is a vector space isomorphism.


5-13. (*) In this problem we'll study the O(3,1) representation (Λ^2Π, Λ^2C^4)
and its decomposition into (1, 0) ⊕ (0, 1) under SO(3,1)_o and so(3,1).

(a) Define the star operator * on Λ^2C^4 by

    * : \Lambda^2 \mathbb{C}^4 \to \Lambda^2 \mathbb{C}^4
    e_i \wedge e_j \mapsto \frac{1}{2}\, \epsilon_{ijkl}\, \eta^{km} \eta^{ln}\, e_m \wedge e_n

and extending linearly. Here η^{ij} are the components of the inverse of
the Minkowski metric and ε_{ijkl} are the components of the usual epsilon
tensor on R^4. Show that * is an intertwiner between Λ^2C^4 and itself
when viewed as an SO(3,1)_o representation, but not as an O(3,1)
representation. You will need (4.19) as well as your results from part
(a) of Problem 3-4.

(b) Compute the action of * on the basis vectors f_i, i = 1, ..., 6 defined
in Example 5.24. Use the f_i to construct eigenvectors of *, and show
that * is diagonalizable with eigenvalues ±i, so that Λ^2C^4 decomposes
into V_{+i} ⊕ V_{−i} where V_{+i} is the eigenspace of * in which every vector
is an eigenvector with eigenvalue +i and likewise for V_{−i}. Then use
Schur's lemma to conclude that any SO(3,1)_o-irreducible subspace
must lie entirely in V_{+i} or V_{−i}. In particular, this means that Λ^2C^4
is not irreducible as an SO(3,1)_o or so(3,1) representation.

(c) A convenient basis for V_{+i} which you may have discovered above is

    v_{+1} \equiv f_1 + i f_4
    v_{+2} \equiv f_2 + i f_5
    v_{+3} \equiv f_3 + i f_6.

Likewise, for V_{−i} we have the basis

    v_{-1} \equiv f_1 - i f_4
    v_{-2} \equiv f_2 - i f_5
    v_{-3} \equiv f_3 - i f_6.

Now consider the vectors

    v_{0,0} \equiv v_{+1} + i v_{+2}
    w_{0,0} \equiv v_{-1} + i v_{-2}.

Show that these are highest weight vectors for (0, 1) and (1, 0)
respectively. Then count dimensions to show that as so(3,1) representations,
V_{+i} ≃ (0, 1) and V_{−i} ≃ (1, 0).

(d) Show directly that Λ^2Π(P)(v_{+j}) = v_{−j}, j = 1, 2, 3 and likewise for
T, so that P and T interchange V_{+i} and V_{−i}, as expected.


5-14. Let F^{μν} be the electromagnetic field tensor from Example 3.16. Restrict
the * operator from the previous problem to Λ^2R^4 and apply it to F to
obtain another antisymmetric (2, 0) tensor, known as the dual field tensor. This
may be familiar from relativistic treatments of electromagnetism, such as
found in Griffiths [10].


Chapter 6 
The Representation Operator 
and Its Applications 


In this chapter we'll introduce the somewhat abstract notion of a representation
operator, which is absent from much of the physics literature but allows for a
unified treatment of several topics of interest in quantum mechanics, including
tensor operators, spherical tensors, quantum-mechanical selection rules, and the
Wigner-Eckart theorem. As usual, we begin with a heuristic introduction in which
we'll try to dispel some of the widespread confusion about spherical tensors and
the Wigner-Eckart theorem, as well as motivate the definition of a representation
operator.


6.1 Invitation: Tensor Operators, Spherical Tensors,
and Wigner-Eckart


Consider an SO(3) representation (Π, H), so that Π : SO(3) → GL(H).
We will assume that H is a complex Hilbert space, since we have
quantum-mechanical applications in mind. The representation Π on H induces a tensor
product representation (Π^1_1, L(H)) as per (5.29):

    \Pi^1_1(R)\, T \equiv \Pi(R)\, T\, \Pi(R)^{-1} \qquad \forall\, R \in SO(3),\; T \in \mathcal{L}(\mathcal{H}).

We emphasize that for (Π^1_1, L(H)), the vector space being acted upon is the space
of operators on H, and that R ∈ SO(3) acts on these operators via similarity
transformations by the operators Π(R).

Now, as an SO(3) representation, L(H) typically contains nontrivial invariant
subspaces. This is not as unfamiliar as it may sound, as the next example will show.

Example 6.1. Vector operators as invariant subspaces of L(L^2(R^3))


Let $\mathcal{H} = L^2(\mathbb{R}^3)$. What are some SO(3)-invariant subspaces of $\mathcal{L}(\mathcal{H})$? If we
consider the subspace

\[ \mathrm{Span}\{\hat{p}_x, \hat{p}_y, \hat{p}_z\} \subset \mathcal{L}(\mathcal{H}) \]

then this is clearly an invariant subspace of $\mathcal{L}(\mathcal{H})$, since

\[ \Pi^1_1(R)\, \hat{p}_i = \Pi(R)\, \hat{p}_i\, \Pi(R)^{-1} = \sum_j R_{ji}\, \hat{p}_j \tag{6.1} \]

as you will show in Exercise 6.1. The same is true for $\mathrm{Span}\{\hat{x}, \hat{y}, \hat{z}\}$. Furthermore,
it should be clear that as (complex) SO(3) representations,

\[ \mathrm{Span}\{\hat{p}_x, \hat{p}_y, \hat{p}_z\} \simeq \mathrm{Span}\{\hat{x}, \hat{y}, \hat{z}\} \simeq \mathbb{C}^3. \]

Since the sets of operators $\{\hat{p}_x, \hat{p}_y, \hat{p}_z\}$ and $\{\hat{x}, \hat{y}, \hat{z}\}$ each span a space equivalent
to the (complex) vector representation of SO(3), these sets are known as vector
operators (see Exercise 6.2 for the equivalence of this definition with that from
Sect. 3.7). Note that by (6.1), the elements of the vector operators transform under
$\Pi^1_1$ like the basis vectors $e_i$ of $\mathbb{R}^3$.


Exercise 6.1. Prove the second equality in (6.1). Do this by letting both sides act on a $\hat{\mathbf{p}}$
eigenket $|\mathbf{p}\rangle$, where $\Pi(R)\,|\mathbf{p}\rangle = |R\mathbf{p}\rangle$.


Exercise 6.2. Prove that this definition of a vector operator is equivalent to the old one.
Do this by letting $R = e^{tL_i}$ in (6.1), differentiating at $t = 0$, and identifying $i\pi(L_i)$
with what we’ve called the total angular momentum operator $J_i$. This should yield the
definition (3.57).
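
The transformation law (6.1) can be checked numerically in a small stand-in setting. The sketch below does not use the operators $\hat{p}_i$ on $L^2(\mathbb{R}^3)$; instead it assumes the finite-dimensional case $\mathcal{H} = \mathbb{C}^3$ with $\Pi(R) = R$, and takes the so(3) generators $L_i$ themselves as the vector operator. The function names (`rotation`, `similarity`, `rotated_combination`) are ours, chosen only for illustration.

```python
import numpy as np

# so(3) generators: (L_i)_{jk} = -epsilon_{ijk}, so that L_i w = e_i x w (cross product).
L = np.zeros((3, 3, 3))
L[0, 1, 2], L[0, 2, 1] = -1.0, 1.0
L[1, 2, 0], L[1, 0, 2] = -1.0, 1.0
L[2, 0, 1], L[2, 1, 0] = -1.0, 1.0


def rotation(axis, angle):
    """Rotation by `angle` about coordinate axis 0, 1 or 2 (Rodrigues' formula)."""
    K = L[axis]
    return np.eye(3) + np.sin(angle) * K + (1.0 - np.cos(angle)) * (K @ K)


def similarity(i, R):
    """Left side of (6.1) with Pi(R) = R: R L_i R^{-1}, using R^{-1} = R^T."""
    return R @ L[i] @ R.T


def rotated_combination(i, R):
    """Right side of (6.1): sum over j of R_{ji} L_j."""
    return sum(R[j, i] * L[j] for j in range(3))


# A generic rotation built from three coordinate-axis rotations.
R = rotation(2, 0.7) @ rotation(0, -1.2) @ rotation(1, 0.4)

max_error = max(
    np.abs(similarity(i, R) - rotated_combination(i, R)).max() for i in range(3)
)
print("largest deviation from (6.1):", max_error)
```

Note the index placement $R_{ji}$: the operators transform like the basis vectors $e_i$, not like components, and swapping the indices makes the check fail for a generic rotation.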


Example 6.2. 2nd rank tensor operators as invariant subspaces of $\mathcal{L}(L^2(\mathbb{R}^3))$


Vector operators aren’t the only kind of invariant subspaces lurking around in $\mathcal{L}(\mathcal{H})$.
Consider also the subspace

\[ \mathrm{Span}\{\hat{x}_i \hat{p}_j\} \subset \mathcal{L}(\mathcal{H}) \]

(in what follows Roman indices run from 1 to 3 unless otherwise noted). These
operators transform under $R \in SO(3)$ as

\begin{align*}
\Pi^1_1(R)(\hat{x}_i \hat{p}_j) &= \Pi(R)\, \hat{x}_i \hat{p}_j\, \Pi(R)^{-1} \\
&= \big(\Pi(R)\, \hat{x}_i\, \Pi(R)^{-1}\big)\big(\Pi(R)\, \hat{p}_j\, \Pi(R)^{-1}\big) \\
&= \sum_{k,l} R_{ki}\, \hat{x}_k\, R_{lj}\, \hat{p}_l \\
&= \sum_{k,l} R_{ki} R_{lj}\, \hat{x}_k \hat{p}_l,
\end{align*}

which is of course just the transformation law for a 2nd rank tensor! Thus, the set
$\{\hat{x}_i \hat{p}_j\}$ is known as a 2nd rank tensor operator. Again, the elements $\hat{x}_i \hat{p}_j$ of the
tensor operator transform like the basis vectors $e_i \otimes e_j$ of the (complex) 2nd rank
tensor representation $\mathbb{C}^3 \otimes \mathbb{C}^3$. □


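We can generalize the previous two examples with the following definition:
An (r, s) tensor operator on a complex SO(3) representation $(\Pi, \mathcal{H})$ is a set of $3^{r+s}$
operators $\{T^{i_1 \ldots i_r}{}_{j_1 \ldots j_s}\} \subset \mathcal{L}(\mathcal{H})$ that transform under $\Pi^1_1$ like the basis vectors
$\{e^{i_1} \otimes \cdots \otimes e^{i_r} \otimes e_{j_1} \otimes \cdots \otimes e_{j_s}\}$ of $\mathcal{T}^r_s(\mathbb{C}^3)$. Put another way,

\[ \mathrm{Span}\{T^{i_1 \ldots i_r}{}_{j_1 \ldots j_s}\} \simeq \mathcal{T}^r_s(\mathbb{C}^3) \]

as SO(3) representations.
Now, recall that $\mathcal{T}^r_s(\mathbb{C}^3)$ is typically decomposable, and so the span of a tensor
operator should be decomposable also. We illustrate this in the next example.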


Example 6.3. Decomposition of tensor operators into spherical tensors


Consider again the tensor operator $\{\hat{x}_i \hat{p}_j\}$. We saw that the span of these operators
forms an SO(3)-invariant subspace equivalent to $\mathcal{T}^0_2(\mathbb{C}^3)$, so by Example 5.33
we must have a decomposition into irreducibles as

\[ \mathrm{Span}\{\hat{x}_i \hat{p}_j\} \simeq V_0 \oplus V_1 \oplus V_2. \tag{6.2} \]

Here $V_0$, $V_1$, and $V_2$ should correspond to $\mathbb{C}g$, $\Lambda^2(\mathbb{C}^3)$, and $S^{2\prime}(\mathbb{C}^3)$ as per (5.54).
But what does this decomposition look like, explicitly?
Since $g = \sum_i e_i \otimes e_i$, it’s clear that $V_0$ should be given by

\[ V_0 = \mathrm{Span}\Big\{\sum_i \hat{x}_i \hat{p}_i\Big\} = \mathrm{Span}\{\underbrace{\hat{\mathbf{x}} \cdot \hat{\mathbf{p}}}_{T^{(0)}_0}\}. \tag{6.3} \]

The appearance of the dot product here shouldn’t be a surprise, as we know $V_0$ must
be spanned by a scalar. The $T^{(0)}_0$ will be explained momentarily.

As for the vector representation $V_1$, we know it should correspond to $\Lambda^2(\mathbb{C}^3) \subset
\mathcal{T}^0_2(\mathbb{C}^3)$. This means we’re looking for some antisymmetric combinations of the $\hat{x}_i$
and $\hat{p}_j$ which transform like a vector; these, of course, are nothing but the angular
momentum operators $\hat{L}_i \equiv \sum_{j,k} \epsilon_{ijk}\, \hat{x}_j \hat{p}_k$. Thus

\[ V_1 = \mathrm{Span}\{\hat{L}_i\} = \mathrm{Span}\Big\{ \underbrace{\tfrac{1}{\sqrt{2}}(\hat{L}_x + i\hat{L}_y)}_{T^{(1)}_1},\ \underbrace{\hat{L}_z}_{T^{(1)}_0},\ \underbrace{-\tfrac{1}{\sqrt{2}}(\hat{L}_x - i\hat{L}_y)}_{T^{(1)}_{-1}} \Big\}. \tag{6.4} \]

In the last expression we switched to (normalized) $J_z$ eigenvectors satisfying (5.65),
and again the $T^{(1)}_m$ will be explained momentarily.

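For $V_2$, the simplest approach is to recall that $\mathrm{Span}\{\hat{x}_i \hat{p}_j\} \simeq \mathcal{T}^0_2(\mathbb{C}^3) \simeq V_1 \otimes V_1$,
and so our preferred $J_z$ eigenbasis is just given by the analogs in $\mathrm{Span}\{\hat{x}_i \hat{p}_j\}$ of the
vectors in (3.60). Translating from the Dirac notation there, using the $J_z$ eigenbasis
for vector operators given in (6.4), and also normalizing gives

\[
V_2 = \mathrm{Span}\left\{
\begin{array}{l}
\tfrac{1}{2}(\hat{x} + i\hat{y})(\hat{p}_x + i\hat{p}_y) \equiv T^{(2)}_2, \\[4pt]
\tfrac{1}{2}\big[\hat{z}(\hat{p}_x + i\hat{p}_y) + (\hat{x} + i\hat{y})\hat{p}_z\big] \equiv T^{(2)}_1, \\[4pt]
\tfrac{1}{\sqrt{6}}\big(2\hat{z}\hat{p}_z - \hat{x}\hat{p}_x - \hat{y}\hat{p}_y\big) \equiv T^{(2)}_0, \\[4pt]
-\tfrac{1}{2}\big[\hat{z}(\hat{p}_x - i\hat{p}_y) + (\hat{x} - i\hat{y})\hat{p}_z\big] \equiv T^{(2)}_{-1}, \\[4pt]
\tfrac{1}{2}(\hat{x} - i\hat{y})(\hat{p}_x - i\hat{p}_y) \equiv T^{(2)}_{-2}
\end{array}
\right\}. \tag{6.5}
\]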

In (6.4) and (6.5) one can see a strong analogy with the $l = 1, 2$ spherical harmonics
(cf. Exercise 2.2). In fact, what we have done is decompose the tensor operator
$\{\hat{x}_i \hat{p}_j\}$ into three subsets composed of the operators $T^{(l)}_m$ on the right-hand
sides of (6.3)-(6.5).¹ By construction each $T^{(l)}$ is a set of $2l + 1$ operators which
transform like the $v_k$ from (5.65), or alternatively like the spherical harmonics $Y^l_m$.
Accordingly, each $T^{(l)}$ is said to be a spherical tensor operator (or just spherical
tensor) of degree $l$. We can thus say that


A spherical tensor of degree $l$ is a collection of $(2l + 1)$ operators $T^{(l)} =
\{T^{(l)}_m\}_{m = -l, \ldots, l}$ whose elements $T^{(l)}_m$ transform under SO(3) similarity
transformations like the spherical harmonics $Y^l_m$.


Returning to our original question, we can then decompose $\mathrm{Span}\{\hat{x}_i \hat{p}_j\}$ as

\[ \mathrm{Span}\{\hat{x}_i \hat{p}_j\} \simeq \mathrm{Span}\, T^{(0)} \oplus \mathrm{Span}\, T^{(1)} \oplus \mathrm{Span}\, T^{(2)}, \]

which we might call a “decomposition of $\{\hat{x}_i \hat{p}_j\}$ into spherical tensors.” □
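
A decomposition of this kind can be tested numerically. The sketch below is a finite-dimensional stand-in, not the operators $\hat{x}_i$, $\hat{p}_j$ themselves: it assumes two commuting vector operators $U$ and $V$ (orbital and spin angular momentum on spin-1 ⊗ spin-1/2) in place of $\hat{\mathbf{x}}$ and $\hat{\mathbf{p}}$, builds scalar, antisymmetric, and symmetric traceless combinations of the products $U_i V_j$, and checks two defining properties of a degree $l$ spherical tensor: each component is an eigenoperator of $[J_z, \cdot\,]$ with eigenvalue $m$, and $\sum_i [J_i, [J_i, \cdot\,]]$ acts as $l(l+1)$. The helper names are ours, and the overall signs chosen for the components do not affect either check.

```python
import numpy as np


def spin_matrices(s):
    """(Jx, Jy, Jz) for spin s in the basis m = s, s-1, ..., -s, with hbar = 1."""
    dim = int(round(2 * s)) + 1
    m = s - np.arange(dim)
    jplus = np.zeros((dim, dim), dtype=complex)
    for k in range(1, dim):
        # <m+1| J_+ |m> = sqrt(s(s+1) - m(m+1))
        jplus[k - 1, k] = np.sqrt(s * (s + 1) - m[k] * (m[k] + 1))
    jminus = jplus.conj().T
    return (jplus + jminus) / 2, (jplus - jminus) / 2j, np.diag(m).astype(complex)


def comm(a, b):
    return a @ b - b @ a


# Two commuting vector operators on spin-1 (x) spin-1/2; J = U + V generates rotations.
U = [np.kron(a, np.eye(2)) for a in spin_matrices(1)]
V = [np.kron(np.eye(3), a) for a in spin_matrices(0.5)]
J = [u + v for u, v in zip(U, V)]

Up, Um = U[0] + 1j * U[1], U[0] - 1j * U[1]
Vp, Vm = V[0] + 1j * V[1], V[0] - 1j * V[1]
# Antisymmetric combinations W_i = sum_{jk} eps_{ijk} U_j V_k (the analog of L_i).
W = [U[1] @ V[2] - U[2] @ V[1], U[2] @ V[0] - U[0] @ V[2], U[0] @ V[1] - U[1] @ V[0]]

T = {
    0: {0: U[0] @ V[0] + U[1] @ V[1] + U[2] @ V[2]},
    1: {1: (W[0] + 1j * W[1]) / np.sqrt(2), 0: W[2], -1: -(W[0] - 1j * W[1]) / np.sqrt(2)},
    2: {
        2: Up @ Vp / 2,
        1: (U[2] @ Vp + Up @ V[2]) / 2,
        0: (2 * U[2] @ V[2] - U[0] @ V[0] - U[1] @ V[1]) / np.sqrt(6),
        -1: -(U[2] @ Vm + Um @ V[2]) / 2,
        -2: Um @ Vm / 2,
    },
}


def casimir(op):
    """sum_i [J_i, [J_i, op]]; equals l(l+1) op on a degree-l irreducible subspace."""
    return sum(comm(Ji, comm(Ji, op)) for Ji in J)


def is_spherical_tensor(components, l):
    return all(
        np.allclose(comm(J[2], op), m * op) and np.allclose(casimir(op), l * (l + 1) * op)
        for m, op in components.items()
    )


checks = {l: is_spherical_tensor(T[l], l) for l in T}
print(checks)
```

A single product such as `U[0] @ V[0]` fails the `casimir` test for every $l$, since it mixes the degree 0 and degree 2 pieces; that is the numerical face of the statement that the tensor operator is decomposable.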


¹ Note that the ‘$l$’ in the superscript of $T^{(l)}_m$ is in parentheses; this is because the $l$ isn’t really an
active index, but just serves to remind us which SO(3) representation we’re dealing with.

With an understanding of spherical tensors now in place, we can give a quick,
heuristic overview of angular momentum selection rules and the Wigner-Eckart
theorem. A more precise treatment will be given in the following sections. Let us
switch to Dirac notation, and consider the product $T^{(q)}_{m_q} |l, m_l\rangle$ of a $J_z$, $J^2$ eigenket
$|l, m_l\rangle$ and degree $q$ spherical tensor operator $T^{(q)}_{m_q}$. Can the SO(3) transformation
properties of the factors tell us anything about the transformation properties of the
product? The answer is yes, and in fact

\begin{align*}
\Pi(R)\, T^{(q)}_{m_q} |l, m_l\rangle &= \Pi(R)\, T^{(q)}_{m_q}\, \Pi(R)^{-1}\, \Pi(R)\, |l, m_l\rangle \\
&= \sum_{n_q, n_l} \Pi_q(R)_{n_q m_q}\, \Pi_l(R)_{n_l m_l}\; T^{(q)}_{n_q} |l, n_l\rangle \tag{6.6}
\end{align*}

where the $\Pi_q(R)_{n_q m_q}$ are just the matrix elements of $R$ in the preferred basis for $V_q$, and
similarly for $\Pi_l(R)_{n_l m_l}$.² The appearance of these matrix elements in (6.6) tells us
that:


$T^{(q)}_{m_q} |l, m_l\rangle$ lives in a subspace equivalent to $V_q \otimes V_l$.


Colloquially, when an operator that lives in $V_q$ acts on a vector that lives in $V_l$, the
result behaves like an element of $V_q \otimes V_l$!

This unsurprising fact actually has far-reaching consequences. Take another
angular momentum eigenket $|j, m_j\rangle$ and consider the inner product
$\langle j, m_j|\, T^{(q)}_{m_q}\, |l, m_l\rangle$. Since $T^{(q)}_{m_q} |l, m_l\rangle$ transforms like an element of $V_q \otimes V_l$,
this inner product must vanish unless $V_j \subset V_q \otimes V_l$. This, combined with (5.56),
immediately gives the very useful angular momentum selection rule

\[ \langle j, m_j|\, T^{(q)}_{m_q}\, |l, m_l\rangle = 0 \quad \text{unless } |l - q| \le j \le l + q. \]

Now suppose that indeed $V_j \subset V_q \otimes V_l$, so that the above inner product need not vanish.
Then a glance back at Example 3.21 shows that this inner product bears a strong
resemblance to the Clebsch-Gordan coefficients $\langle j, m_j | m_q, m_l\rangle$ from (3.59). The
content of the Wigner-Eckart theorem is that this is more than just a resemblance,
but is in fact a strict proportionality:


Proposition 6.1 (Wigner-Eckart I). Let $(\Pi, \mathcal{H})$ be a representation of SO(3) on
a quantum-mechanical Hilbert space $\mathcal{H}$. Let $T^{(q)} = \{T^{(q)}_{m_q}\}$ be a spherical tensor of
degree $q$, and let $|l, m_l\rangle$, $|j, m_j\rangle$ be $J_z$, $J^2$ eigenkets. Then

\[ \langle j, m_j|\, T^{(q)}_{m_q}\, |l, m_l\rangle = c\, \langle j, m_j | m_q, m_l\rangle \]

where $c$ is independent of $m_l$, $m_q$, and $m_j$.


We will state this more generally, precisely, and with proof in the next section.


² These are known as Wigner functions or Wigner D-matrices.

6.2 Representation Operators, Selection Rules, and the Wigner-Eckart Theorem


You may have noticed that the definitions of vector, tensor, and spherical tensor 
operators from the last section were somewhat unwieldy; in particular, they relied 
on an equivalence between the transformation properties of specific operators and 
that of specific basis vectors for the $V_l$. It would be preferable to have a
basis-independent definition which subsumes the previous definitions. To that end, we
introduce in this section the notion of a representation operator, which generalizes
the notions of vector and tensor operators as well as spherical tensors. We’ll then
use representation operators to derive a fundamental quantum-mechanical selection
rule, which lays the foundation for the various selection rules one encounters in
standard quantum mechanics courses. Representation operators will also play a key
role in the Wigner-Eckart theorem, as we’ll see.

Given a representation $(\Pi_0, V_0)$ of a group $G$ on some auxiliary vector space $V_0$,
as well as a unitary representation $(\Pi, \mathcal{H})$ of $G$ on some Hilbert space $\mathcal{H}$, we define
a representation operator to simply be an intertwiner between $V_0$ and $\mathcal{L}(\mathcal{H})$, or in
other words a linear map $\rho : V_0 \to \mathcal{L}(\mathcal{H})$ satisfying

\[ \rho(\Pi_0(g) v) = \Pi^1_1(g)\, \rho(v) \qquad \forall\, g \in G,\ v \in V_0. \tag{6.7} \]

Note that in terms of maps between $V_0$ and $\mathcal{L}(\mathcal{H})$, this just says

\[ \rho \circ \Pi_0(g) = \Pi^1_1(g) \circ \rho \qquad \forall\, g \in G. \]

What this definition is saying, roughly, is that we have a subspace $\rho(V_0) \subset \mathcal{L}(\mathcal{H})$
which, even though it’s composed of operators acting on $\mathcal{H}$, actually transforms
like the space $V_0$ under similarity transformations by the operators $\Pi(g)$. Note that,
strictly speaking, the representation operator $\rho$ is not itself an operator, but an
intertwiner between representations.

How does this definition subsume the previous ones? Let $V_0$ have a basis $\{e_i\}$;
then since $\rho$ is linear, it’s completely determined by its action on this basis. Plugging
a basis vector $e_i$ into (6.7) then yields

\[ \Pi(g)\, \rho(e_i)\, \Pi(g)^{-1} = (\Pi_0(g))_i{}^j\, \rho(e_j). \]

If we set $\rho(e_i) \equiv B_i$, this becomes

\[ \Pi(g)\, B_i\, \Pi(g)^{-1} = (\Pi_0(g))_i{}^j\, B_j. \tag{6.8} \]

This, of course, just says that under the representation $\Pi^1_1$, the $B_i$ transform like
basis vectors of the representation $(\Pi_0, V_0)$, which was how we defined vector
operators, tensor operators, and spherical tensors in the first place! In fact, if we
take $G = SO(3)$ and $V_0 = \mathbb{R}^3$, then (6.8) becomes (writing the matrices of the
fundamental representation with both indices down)

\[ \Pi(R)\, B_i\, \Pi(R)^{-1} = \sum_j R_{ji}\, B_j \]

which is of course just (6.1). To get tensor operators or spherical tensors, we just
take $V_0 = \mathcal{T}^r_s(\mathbb{R}^3)$ or $V_0 = \mathcal{H}_l(\mathbb{R}^3)$, respectively.
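
To see the defining property (6.7) at work in a concrete case, here is a minimal numerical sketch. It assumes a setting not spelled out above: $G = SU(2)$ acting on $\mathcal{H} = \mathbb{C}^2$ by $\Pi(U) = U$, with $V_0 = \mathbb{R}^3$ carrying the rotation $R$ associated to $U$ via $R_{ij} = \tfrac{1}{2}\mathrm{tr}(\sigma_i U \sigma_j U^\dagger)$, and the candidate representation operator $\rho(v) = \sum_i v_i \sigma_i$. The names `su2_element`, `rotation_of`, and `rho` are illustrative.

```python
import numpy as np

sigma = [
    np.array([[0, 1], [1, 0]], dtype=complex),
    np.array([[0, -1j], [1j, 0]], dtype=complex),
    np.array([[1, 0], [0, -1]], dtype=complex),
]


def su2_element(axis, angle):
    """exp(-i (angle/2) n.sigma) for the unit vector n along `axis`."""
    n = np.asarray(axis, dtype=float)
    n = n / np.linalg.norm(n)
    n_sigma = sum(ni * si for ni, si in zip(n, sigma))
    return np.cos(angle / 2) * np.eye(2) - 1j * np.sin(angle / 2) * n_sigma


def rotation_of(U):
    """Pi_0(U): the 3x3 rotation with entries R_ij = (1/2) tr(sigma_i U sigma_j U^dagger)."""
    Ud = U.conj().T
    return np.array(
        [[0.5 * np.trace(sigma[i] @ U @ sigma[j] @ Ud).real for j in range(3)]
         for i in range(3)]
    )


def rho(v):
    """Candidate representation operator rho : R^3 -> L(C^2), v -> v.sigma."""
    return sum(vi * si for vi, si in zip(v, sigma))


U = su2_element([1, 2, -1], 0.9) @ su2_element([0, 1, 3], -2.3)
R = rotation_of(U)
v = np.array([0.3, -1.1, 0.8])

lhs = rho(R @ v)                 # rho(Pi_0(g) v)
rhs = U @ rho(v) @ U.conj().T    # Pi(g) rho(v) Pi(g)^{-1}, with U^{-1} = U^dagger
print(np.allclose(lhs, rhs))
```

Evaluating the same identity on the basis vectors $e_i$ gives $U \sigma_i U^{-1} = \sum_j R_{ji}\, \sigma_j$, which is the component form (6.8) for this example.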

Now that we have defined representation operators, we can go on to formulate 
the fundamental selection rule from which the usual quantum-mechanical selection 
rules can be derived. Then we can state and prove the Wigner-Eckart theorem,
which is a kind of complement to the angular momentum selection rules. First, 
though, we need the following fact, the proof of which we only sketch. The details 
are deferred to the problems referenced below. 


Proposition 6.2. Let $W_1$ and $W_2$ be finite-dimensional inequivalent irreducible
subspaces of a unitary representation $(\Pi, \mathcal{H})$ equipped with an inner product $(\cdot\,|\,\cdot)$.
Then $W_1$ is orthogonal to $W_2$.


Proof sketch. Define the orthogonal projection operator $P : \mathcal{H} \to W_2$ to be the
map which sends $v \in \mathcal{H}$ to the unique vector $P(v) \in W_2$ satisfying

\[ (P(v)|w) = (v|w) \qquad \forall\, w \in W_2. \]

This is depicted schematically in Fig. 6.1. You will check in Problem 6-1 that such
a vector $P(v)$ exists and is in fact unique. If we now restrict $P$ to $W_1$, we get
$P|_{W_1} : W_1 \to W_2$, and using the unitarity of $\Pi$ and the invariance of the $W_i$ one can
show that $P|_{W_1}$ is an intertwiner. One can then use Schur’s lemma to conclude that
$P|_{W_1} = 0$, which then implies that $W_1$ is orthogonal to $W_2$. To fill out the details,
see Problem 6-2. □


Fig. 6.1 Action of the orthogonal projection operator $P : \mathcal{H} \to W_2$ on a vector $v \in \mathcal{H}$


With this in hand, we can now state and prove


Proposition 6.3 (Selection Rule). Let $G$ be a semi-simple group, and let $W_1$
and $W_2$ be finite-dimensional inequivalent irreducible subspaces of a unitary
representation $(\Pi, \mathcal{H})$ of $G$. Also let $\rho : V_0 \to \mathcal{L}(\mathcal{H})$ be a representation operator,
where $(\Pi_0, V_0)$ is some auxiliary representation. Then

\[ (w_1 | \rho(v) w_2) = 0 \qquad \forall\, v \in V_0,\ w_i \in W_i \]

unless the decomposition of $V_0 \otimes W_2$ into irreducibles contains a representation
equivalent to $W_1$.


Before proving this, note that this proposition just formalizes what we observed
in the last section: namely, that a vector of the form $\rho(v) w_2$ is a kind of “product”
of elements of $V_0$ and $W_2$, and thus transforms like something in $V_0 \otimes W_2 = U_1 \oplus
\cdots \oplus U_k$. Thus, for there to be any overlap with an irreducible subspace $W_1$, $W_1$ must
then be equivalent to one of the $U_i$.


Proof of proposition. Define a map by

\begin{align*}
T : V_0 \otimes W_2 &\to \mathcal{H} \\
v \otimes w_2 &\mapsto \rho(v) w_2 \tag{6.9}
\end{align*}

and extend linearly to arbitrary elements of $V_0 \otimes W_2$. You can check that this is a
linear map between vector spaces, and so the image $T(V_0 \otimes W_2) \equiv D \subset \mathcal{H}$ is a
vector subspace of $\mathcal{H}$. Now, since $G$ is semi-simple we can decompose $V_0 \otimes W_2$ into
irreducibles as

\[ V_0 \otimes W_2 = U_1 \oplus \cdots \oplus U_k \]

for some irreps $U_i$. As you will show below, the fact that $\rho$ is a representation
operator implies that $T$ is in fact an intertwiner, and this further implies that the
kernel of $T$ (cf. Exercise 4.19) is an invariant subspace of $V_0 \otimes W_2$. This means that
(with a possible relabeling of the $U_i$) we can write the kernel of $T$ as $U_1 \oplus \cdots \oplus U_m$
for some $m \le k$, which then implies that $D$ is equivalent to $U_{m+1} \oplus \cdots \oplus U_k$. If
none of the $U_i$ are equivalent to $W_1$, then by Proposition 6.2, every vector in $D$ is
orthogonal to every vector in $W_1$, i.e.

\[ (w_1 | \rho(v) w_2) = 0 \qquad \forall\, v \in V_0,\ w_i \in W_i, \]

which is what we wanted to prove. □


Exercise 6.3. Quickly show that the fact that $\rho$ is a representation operator implies that $T$
is an intertwiner. Show further that the kernel of $T$ is an invariant subspace of $V_0 \otimes W_2$.


We’ll now use this generalized selection rule to reproduce some of the familiar
selection rules from quantum mechanics.

Example 6.4. Parity Selection Rules


Let $(\Pi, \mathcal{H})$ be a complex unitary representation of the two-element group $\mathbb{Z}_2 =
\{I, P\}$ where $P$ is the parity operator. Let $v_\alpha, v_\beta \in \mathcal{H}$ be parity eigenstates
with eigenvalues $c_\alpha, c_\beta$. If $c_\alpha = 1$, then $v_\alpha$ spans the one-dimensional trivial
representation of $\mathbb{Z}_2$, and if $c_\alpha = -1$ then $v_\alpha$ spans the alternating representation.
Likewise for $c_\beta$. Now let $V_0$ be another one-dimensional irrep of $\mathbb{Z}_2$ with parity
eigenvalue $c_0$, and let $B \in \rho(V_0)$, where $\rho$ is a representation operator. Then a little
thought shows that the selection rule implies

\[ (v_\beta | B v_\alpha) = 0 \quad \text{unless } c_\beta = c_0 c_\alpha. \]

Thus if $B$ is parity-odd ($c_0 = -1$) then it can only connect states of opposite
parity, and if it is parity-even ($c_0 = +1$) then it can only connect states of
the same parity. If we were looking at dipolar radiative transitions (which emit a
photon) between electronic states of an atom or molecule, the relevant operators are
the components of the dipole operator $\mathbf{p}$, which is parity-odd (since it is a vector
operator). The above then tells us that dipolar radiative transitions can only occur
between electronic states of opposite parity.
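
The parity rule is easy to see numerically. The following sketch assumes a one-dimensional toy model that is not part of the example above: wavefunctions sampled on a grid symmetric about the origin, with parity acting by reversing the grid, the position operator $x$ standing in for a dipole component, and $x^2$ as a parity-even comparison. All names are illustrative.

```python
import numpy as np

n = 201
x = np.linspace(-5.0, 5.0, n)      # grid symmetric about the origin
P = np.eye(n)[::-1]                # parity: (P f)(x) = f(-x) reverses the grid
X = np.diag(x)                     # position operator (one dipole component)
X2 = np.diag(x**2)                 # a parity-even operator for comparison


def operator_parity(B):
    """Return c0 = +1 or -1 if P B P^{-1} = c0 B, else None."""
    PBP = P @ B @ P.T              # P is a permutation matrix, so P^{-1} = P^T
    for c0 in (1, -1):
        if np.allclose(PBP, c0 * B):
            return c0
    return None


def state_parity(v):
    """Return c = +1 or -1 if P v = c v, else None."""
    for c in (1, -1):
        if np.allclose(P @ v, c * v):
            return c
    return None


states = {"even": np.exp(-x**2 / 2), "odd": x * np.exp(-x**2 / 2)}
operators = {"x": X, "x^2": X2}

results = []
for op_name, B in operators.items():
    c0 = operator_parity(B)
    for a_name, va in states.items():
        for b_name, vb in states.items():
            allowed = state_parity(vb) == c0 * state_parity(va)
            element = vb @ B @ va  # (v_beta | B v_alpha) for real vectors
            results.append((op_name, b_name, a_name, allowed, element))
            print(f"({b_name}|{op_name}|{a_name}) = {element: .3e}  allowed: {allowed}")
```

Every matrix element flagged as not allowed comes out zero to rounding error, while the allowed ones are of order one or larger.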


Example 6.5. Angular Momentum Selection Rules


Let $(\Pi, \mathcal{H})$ be a complex unitary representation of SU(2), with two subspaces $W_l$
and $W_j$ equivalent to $V_l$ and $V_j$ respectively, $2l, 2j \in \mathbb{Z}$. Also suppose we have an
SU(2) representation operator $\mathcal{A} : V_q \to \mathcal{L}(\mathcal{H})$, $q \in \mathbb{Z}$ (in other words, a spherical
tensor). Then for any $v \in W_l$, $v' \in W_j$, $A \in \mathcal{A}(V_q)$, the selection rule tells us that

\[ (v' | A v) = 0 \quad \text{unless } |l - q| \le j \le l + q. \]

If we again consider a dipolar radiative transition between electronic states of an
atom or molecule, the relevant operator is still the dipole $\mathbf{p}$ whose components $p_i$
live in $\rho(V_1)$, and so we find that a dipolar radiative transition between states with
angular momentum $j$ and $l$ can only occur if $|l - 1| \le j \le l + 1$. □
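
Here is a small numerical illustration of this rule for $q = 1$. It assumes a toy Hilbert space, spin-1 ⊗ spin-1, which decomposes into total angular momentum $j = 0, 1, 2$, and uses the angular momentum of the first factor as the vector operator. The projectors onto the $W_j$ are obtained by diagonalizing $J^2$; function and variable names are ours.

```python
import numpy as np


def spin_matrices(s):
    """(Jx, Jy, Jz) for spin s in the basis m = s, s-1, ..., -s, with hbar = 1."""
    dim = int(round(2 * s)) + 1
    m = s - np.arange(dim)
    jplus = np.zeros((dim, dim), dtype=complex)
    for k in range(1, dim):
        # <m+1| J_+ |m> = sqrt(s(s+1) - m(m+1))
        jplus[k - 1, k] = np.sqrt(s * (s + 1) - m[k] * (m[k] + 1))
    jminus = jplus.conj().T
    return (jplus + jminus) / 2, (jplus - jminus) / 2j, np.diag(m).astype(complex)


# H = spin-1 (x) spin-1 decomposes into total angular momentum j = 0, 1, 2.
L1 = [np.kron(a, np.eye(3)) for a in spin_matrices(1)]   # vector operator, q = 1
L2 = [np.kron(np.eye(3), a) for a in spin_matrices(1)]
J = [a + b for a, b in zip(L1, L2)]
evals, evecs = np.linalg.eigh(sum(Ji @ Ji for Ji in J))  # spectrum of J^2


def projector(j):
    """Orthogonal projector onto the J^2 = j(j+1) eigenspace W_j."""
    cols = evecs[:, np.isclose(evals, j * (j + 1))]
    return cols @ cols.conj().T


q = 1
table = {}
for j in (0, 1, 2):
    for l in (0, 1, 2):
        # Largest block P_j A P_l over the three components A of the vector operator.
        size = max(np.linalg.norm(projector(j) @ A @ projector(l)) for A in L1)
        allowed = abs(l - q) <= j <= l + q
        table[(j, l)] = (allowed, size)
        print(f"j={j}, l={l}: allowed={allowed}, size={size:.3f}")
```

The blocks with $(j, l) = (2, 0)$, $(0, 2)$, and $(0, 0)$ vanish, as the rule requires. The case $(0, 0)$ shows why the lower bound is $|l - q|$ rather than $l - q$: a vector operator has no matrix elements between two $j = 0$ states.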


The famous Wigner-Eckart theorem can be seen as a kind of complement to the
angular momentum selection rule above. In the notation of the previous example,
the Wigner-Eckart theorem says (roughly) that when $(v' | A v)$ is not equal to zero, it
is still tightly constrained and is in fact determined up to a constant by the fact that
$\mathcal{A}$ is a representation operator. The precise statement is as follows:


Proposition 6.4 (Wigner-Eckart II). Let $(\Pi, \mathcal{H})$ be a complex unitary representation
of SU(2), with two subspaces $W_l$ and $W_j$ equivalent to $V_l$ and $V_j$ respectively,
$2l, 2j \in \mathbb{Z}$. Also suppose we have two SU(2) representation operators $\mathcal{A}, \mathcal{B} : V_q \to
\mathcal{L}(\mathcal{H})$, $q \in \mathbb{Z}$, which yield two spherical tensors with components $A_k \equiv \mathcal{A}(v_k)$,
$B_k \equiv \mathcal{B}(v_k)$, $0 \le k \le 2q$. Finally, assume that

\[ (v' | A_k v) \ne 0 \quad \text{for some } k \text{ and } v \in W_l,\ v' \in W_j. \tag{6.10} \]

Then for all $k$ and $w \in W_l$, $w' \in W_j$, we have

\[ (w' | B_k w) = c\, (w' | A_k w) \tag{6.11} \]

for some constant $c \in \mathbb{C}$ which is independent of $k$, $w$, and $w'$.


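Proof. Let $\mathcal{L}(W_l, W_j)$ denote the vector space of all linear maps from $W_l$ to $W_j$. By
restricting to $W_l$ and using the orthogonal projection operator $P_j : \mathcal{H} \to W_j$, we
can turn $\mathcal{A} : V_q \to \mathcal{L}(\mathcal{H})$ into a map

\begin{align*}
\tilde{\mathcal{A}} : V_q &\to \mathcal{L}(W_l, W_j) \\
v &\mapsto P_j \circ \mathcal{A}(v)|_{W_l}.
\end{align*}

All we’ve done here is take the linear operator $\mathcal{A}(v) \in \mathcal{L}(\mathcal{H})$ and restrict it to $W_l$ and
then project onto $W_j$. Now, since $W_l$, $W_j$ are SU(2)-invariant subspaces, the vector
space $\mathcal{L}(W_l, W_j)$ actually furnishes a representation $(\Pi^1_1, \mathcal{L}(W_l, W_j))$ of SU(2) by

\[ \Pi^1_1(g)\, T \equiv \Pi(g)\, T\, \Pi(g)^{-1}, \qquad T \in \mathcal{L}(W_l, W_j),\ g \in SU(2). \]

You will check below that this is a bona fide representation, and is in fact equivalent
to the tensor product representation $W_l^* \otimes W_j \simeq W_l \otimes W_j$! Furthermore, since the
action of SU(2) on $\mathcal{H}$ commutes with restriction and projection (cf. Problem 6-2),
it’s not hard to see that $\tilde{\mathcal{A}}$ is an intertwiner. From (6.10) we know that $\tilde{\mathcal{A}}$ is not zero,
and so from Schur’s lemma we conclude that $\mathcal{L}(W_l, W_j) \simeq W_l \otimes W_j$ has a subspace
$U_q$ equivalent to $V_q$, and that $\tilde{\mathcal{A}}$ is a vector space isomorphism from $V_q$ to $U_q$.

Now, we can also use our second representation operator $\mathcal{B}$ to construct a second
intertwiner $\tilde{\mathcal{B}} : V_q \to U_q \subset \mathcal{L}(W_l, W_j)$ (its image lies in the same $U_q$ because $V_q$ occurs
at most once in the decomposition of $W_l \otimes W_j$). Then we invoke the corollary of Schur’s
lemma that you proved in Exercise 5.34 to conclude that $\tilde{\mathcal{B}} = c \tilde{\mathcal{A}}$. But this then
means that

\begin{align*}
& \tilde{\mathcal{B}}(v_k) = c\, \tilde{\mathcal{A}}(v_k) \quad \forall\, k \\
\Longrightarrow\ & (w' | \mathcal{B}(v_k) w) = c\, (w' | \mathcal{A}(v_k) w) \quad \forall\, k \text{ and } w \in W_l,\ w' \in W_j \\
\Longrightarrow\ & (w' | B_k w) = c\, (w' | A_k w) \quad \forall\, k \text{ and } w \in W_l,\ w' \in W_j
\end{align*}

and so we are done. We here used the definitions of $\tilde{\mathcal{A}}$ and $\tilde{\mathcal{B}}$, the definition of the orthogonal
projection operator $P_j$, and the definitions $A_k = \mathcal{A}(v_k)$, $B_k = \mathcal{B}(v_k)$. □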
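
The proportionality (6.11) can be observed directly in a small example. The sketch below assumes the toy space spin-1 ⊗ spin-1/2 with two vector operators, the orbital part $L$ and the spin part $S$, playing the roles of $\mathcal{A}$ and $\mathcal{B}$ (with Cartesian rather than spherical components, which span the same space of operators). It projects both onto each pair of subspaces $W_j$, $W_l$ and checks that the projected operators differ by one constant $c$ for all three components. The names `projector` and `proportionality` are illustrative.

```python
import numpy as np


def spin_matrices(s):
    """(Jx, Jy, Jz) for spin s in the basis m = s, s-1, ..., -s, with hbar = 1."""
    dim = int(round(2 * s)) + 1
    m = s - np.arange(dim)
    jplus = np.zeros((dim, dim), dtype=complex)
    for k in range(1, dim):
        # <m+1| J_+ |m> = sqrt(s(s+1) - m(m+1))
        jplus[k - 1, k] = np.sqrt(s * (s + 1) - m[k] * (m[k] + 1))
    jminus = jplus.conj().T
    return (jplus + jminus) / 2, (jplus - jminus) / 2j, np.diag(m).astype(complex)


# Two vector operators on spin-1 (x) spin-1/2, which contains j = 1/2 and j = 3/2.
L = [np.kron(a, np.eye(2)) for a in spin_matrices(1)]
S = [np.kron(np.eye(3), a) for a in spin_matrices(0.5)]
J = [a + b for a, b in zip(L, S)]
evals, evecs = np.linalg.eigh(sum(Ji @ Ji for Ji in J))  # spectrum of J^2


def projector(j):
    """Orthogonal projector onto the J^2 = j(j+1) eigenspace W_j."""
    cols = evecs[:, np.isclose(evals, j * (j + 1))]
    return cols @ cols.conj().T


def proportionality(j, l):
    """Return (c, ok); ok is True when P_j S_k P_l = c P_j L_k P_l for all k."""
    Pj, Pl = projector(j), projector(l)
    a = np.concatenate([(Pj @ A @ Pl).ravel() for A in L])
    b = np.concatenate([(Pj @ B @ Pl).ravel() for B in S])
    c = np.vdot(a, b) / np.vdot(a, a)   # least-squares estimate of the constant
    return c, np.allclose(b, c * a)


constants = {(j, l): proportionality(j, l) for j in (0.5, 1.5) for l in (0.5, 1.5)}
for (j, l), (c, ok) in constants.items():
    print(f"j={j}, l={l}: c={c.real:+.4f}, proportional={ok}")
```

The constant depends on the pair $(j, l)$ but not on the component $k$ or on the states. For $j \ne l$ it equals $-1$, because $L_k + S_k = J_k$ maps each $W_l$ into itself.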


Exercise 6.4. Show that if $T \in \mathcal{L}(W_l, W_j)$ then so is $\Pi(g)\, T\, \Pi(g)^{-1}$, so that
$(\Pi^1_1, \mathcal{L}(W_l, W_j))$ really is a representation; in fact, it is an invariant subspace of $(\Pi^1_1, \mathcal{L}(\mathcal{H}))$.
In analogy to the equivalence between $V^* \otimes V$ and $\mathcal{L}(V)$, show that $(\Pi^1_1, \mathcal{L}(W_l, W_j))$ is
equivalent to $W_l^* \otimes W_j$, which by Problem 5-4 is equivalent to $W_l \otimes W_j$.


You may have noticed that this is not the way we stated the Wigner-Eckart
theorem earlier, which may also be familiar from advanced quantum mechanics
texts like Sakurai [17]. To make the connection, consider the intertwiner

\begin{align*}
T : V_q \otimes W_l &\to W_j \\
v \otimes w &\mapsto P_j(\mathcal{A}(v) w).
\end{align*}

If we work with standard bases $\{v_m\}_{m = 0, \ldots, 2q}$, $\{w_n\}_{n = 0, \ldots, 2l}$, $\{w'_p\}_{p = 0, \ldots, 2j}$ for $V_q$, $W_l$,
and $W_j$, then this map has components

\begin{align*}
T_{mn}{}^p &= T(v_m, w_n, w'^p) \\
&= w'^p\big(P_j(\mathcal{A}(v_m) w_n)\big) \\
&= (w'_p | P_j(A_m w_n)) \\
&= (w'_p | A_m w_n). \tag{6.12}
\end{align*}

Now let’s switch to the orthonormal bases familiar from quantum mechanics, which
look like

\begin{align*}
|q, m_q\rangle &\in V_q, & -q \le m_q \le q \\
|l, m_l\rangle &\in W_l, & -l \le m_l \le l \\
|j, m_j\rangle' &\in W_j, & -j \le m_j \le j
\end{align*}

(notice the prime on the last set of vectors, which will distinguish it from vectors
in $V_q \otimes W_l$ with the same quantum numbers). With this basis and notation, the
components (6.12) of $T$ become the matrix elements ${}'\langle j, m_j|\, A_{m_q}\, |l, m_l\rangle$.

What do these matrix elements have to do with Clebsch-Gordan coefficients?
Recall that $V_q \otimes W_l$ has two convenient sets of orthonormal basis vectors:

\begin{align*}
|q, m_q\rangle \otimes |l, m_l\rangle \equiv |ql; m_q, m_l\rangle &\in V_q \otimes W_l, & -q \le m_q \le q,\ -l \le m_l \le l \\
|l', m_{l'}\rangle &\in V_q \otimes W_l, & |l - q| \le l' \le l + q,\ -l' \le m_{l'} \le l'.
\end{align*}

The Clebsch-Gordan coefficients are just the inner products $\langle l', m_{l'} | ql; m_q, m_l\rangle$ of
these basis vectors. Since $T : V_q \otimes W_l \to W_j$ is nonzero, $V_q \otimes W_l$ must contain
a subspace $U_j$ equivalent to $W_j$, and so we can consider the orthogonal projection
operator $P : V_q \otimes W_l \to U_j$. By Problem 6-1 this is given by

\begin{align*}
P : V_q \otimes W_l &\to U_j \\
|ql; m_q, m_l\rangle &\mapsto \sum_{-j \le m_j \le j} \langle j, m_j | ql; m_q, m_l\rangle\, |j, m_j\rangle
\end{align*}

and is an intertwiner by Problem 6-2. By then making the obvious identification of
$U_j$ with $W_j$ and hence $|j, m_j\rangle \leftrightarrow |j, m_j\rangle'$, we get the intertwiner

\begin{align*}
P' : V_q \otimes W_l &\to W_j \\
|ql; m_q, m_l\rangle &\mapsto \sum_{-j \le m_j \le j} \langle j, m_j | ql; m_q, m_l\rangle\, |j, m_j\rangle'
\end{align*}

whose components are nothing but the Clebsch-Gordan coefficients! By
Wigner-Eckart, though, this intertwiner must be proportional to $T$, and so its components
must be proportional to those of $T$. We thus have the following component version
of the Wigner-Eckart theorem:


Proposition 6.5 (Wigner-Eckart III). Let $(\Pi, \mathcal{H})$ be a complex unitary representation
of SU(2), with two subspaces $W_l$ and $W_j$ equivalent to $V_l$ and $V_j$ respectively,
$2l, 2j \in \mathbb{Z}$. Also suppose we have a degree $q$ spherical tensor $A = \{A_{m_q}\}$. Then

\[ \langle j, m_j|\, A_{m_q}\, |l, m_l\rangle = c\, \langle j, m_j | ql; m_q, m_l\rangle \]

where $c$ is a constant independent of $m_l$, $m_q$, and $m_j$.


6.3 Gamma Matrices and Dirac Bilinears 


We conclude this short chapter with what is essentially an extended example, which 
involves both the representation operators we’ve met in this chapter and the O(3, 1) 
representation theory we developed in the last. This section relies on a familiarity 
with the theory of the Dirac electron; if you have not seen this material, and in 
particular are unfamiliar with gamma matrices and Dirac bilinears, then this section 
can be skipped. This section borrows heavily from Frankel [6]. 

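Let $(D \equiv \Pi \oplus \bar{\Pi}, \mathbb{C}^4)$ be the Dirac spinor representation of SL(2, C), so that

\begin{align*}
D : SL(2, \mathbb{C}) &\to GL(4, \mathbb{C}) \\
A &\mapsto D(A) \equiv \begin{pmatrix} A & 0 \\ 0 & A^{\dagger -1} \end{pmatrix}.
\end{align*}

We claim that the gamma matrices $\gamma_\mu$ can be seen as the components of a
representation operator $\gamma : \mathbb{R}^4 \to \mathcal{L}(\mathbb{C}^4) = M_4(\mathbb{C})$, where $\mathbb{R}^4$ is the Minkowski
four-vector representation of SL(2, C). To define $\gamma$, we need the following two
identifications of $\mathbb{R}^4$ with $H_2(\mathbb{C})$, where $X = (\mathbf{x}, t) \in \mathbb{R}^4$ and $\boldsymbol{\sigma} \equiv (\sigma_x, \sigma_y, \sigma_z)$:

\begin{align*}
\mathbb{R}^4 &\leftrightarrow H_2(\mathbb{C}) \\
X &\leftrightarrow \tilde{X} \equiv \mathbf{x} \cdot \boldsymbol{\sigma} + tI \tag{6.13a} \\
X &\leftrightarrow X^* \equiv \mathbf{x} \cdot \boldsymbol{\sigma} - tI. \tag{6.13b}
\end{align*}

Note that (6.13a) is just the identification we used in Examples 4.23 and 5.5 in
defining the four-vector representation of SL(2, C), and (6.13b) is just a slight
variation of that. Now, using the well-known property of the sigma matrices that

\[ \sigma_i \sigma_j + \sigma_j \sigma_i = 2\delta_{ij}, \tag{6.14} \]

you will verify below that

\[ X^* = \eta(X, X)\, \tilde{X}^{-1}. \tag{6.15} \]

Also, if $\rho : SL(2, \mathbb{C}) \to SO(3, 1)_o$ is the homomorphism from Example 4.23 which
defines the four-vector representation, then by the definition of $\rho$ we have

\[ \widetilde{(\rho(A) X)} = A \tilde{X} A^\dagger \tag{6.16} \]

which when combined with (6.15) yields

\[ (\rho(A) X)^* = A^{\dagger -1} X^* A^{-1} \tag{6.17} \]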

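as you will also show below. If we then define a map $\gamma$ by

\begin{align*}
\gamma : \mathbb{R}^4 &\to M_4(\mathbb{C}) \\
X &\mapsto \begin{pmatrix} 0 & \tilde{X} \\ X^* & 0 \end{pmatrix},
\end{align*}

you can then use (6.16) and (6.17) to check that $\gamma$ is a representation operator, i.e.
that

\[ \gamma(\rho(A) X) = D(A)\, \gamma(X)\, D(A)^{-1}. \tag{6.18} \]

If we define the gamma matrices as $\gamma_\mu \equiv \gamma(e_\mu)$, then we find that

\[ \gamma_i = \begin{pmatrix} 0 & \sigma_i \\ \sigma_i & 0 \end{pmatrix},\ i = 1, 2, 3, \qquad \gamma_4 = \begin{pmatrix} 0 & I \\ -I & 0 \end{pmatrix}, \qquad \gamma_5 \equiv \gamma_1 \gamma_2 \gamma_3 \gamma_4 = i \begin{pmatrix} -I & 0 \\ 0 & I \end{pmatrix}. \tag{6.19} \]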


Up to a few minus signs that have to do with our choice of signature for the
Minkowski metric, as well as the fact that most texts write $\gamma_0$ instead of $\gamma_4$, this is
the familiar chiral representation of the gamma matrices (the chiral representation
is the one in which $\gamma_5$ is diagonal). If we let $\Lambda_\mu{}^\nu$ be the components of $\rho(A)$, we
can take $X = e_\mu$ in (6.18) to get

\[ \Lambda_\mu{}^\nu\, \gamma_\nu = D(A)\, \gamma_\mu\, D(A)^{-1} \tag{6.20} \]

which is how (6.18) usually appears in physics texts. Another important property is
that the gamma matrices satisfy the fundamental anticommutation relation

\[ \gamma_\mu \gamma_\nu + \gamma_\nu \gamma_\mu = 2\eta_{\mu\nu} \tag{6.21} \]

which you can easily verify. This relationship is the starting point for Clifford
algebras, which we also mentioned at the very end of Chap. 5.³
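
These properties lend themselves to a quick numerical check. The sketch below makes two assumptions that should be compared against the definitions above: that $D(A)$ is block diagonal with blocks $A$ and $(A^\dagger)^{-1}$, and that $\gamma(X)$ is block off-diagonal with upper-right block $\mathbf{x} \cdot \boldsymbol{\sigma} + tI$ and lower-left block $\mathbf{x} \cdot \boldsymbol{\sigma} - tI$. The metric is taken with signature $(+,+,+,-)$ for $X = (\mathbf{x}, t)$, and the Lorentz matrix of $A$ is read off from (6.16). All function names are illustrative.

```python
import numpy as np

I2 = np.eye(2, dtype=complex)
Z2 = np.zeros((2, 2), dtype=complex)
sigma = [
    np.array([[0, 1], [1, 0]], dtype=complex),
    np.array([[0, -1j], [1j, 0]], dtype=complex),
    np.array([[1, 0], [0, -1]], dtype=complex),
]
eta = np.diag([1.0, 1.0, 1.0, -1.0])   # X = (x, t), signature (+, +, +, -)


def tilde(X):
    """(6.13a): x.sigma + t I."""
    return sum(X[i] * sigma[i] for i in range(3)) + X[3] * I2


def star(X):
    """(6.13b): x.sigma - t I."""
    return sum(X[i] * sigma[i] for i in range(3)) - X[3] * I2


def gamma(X):
    """Assumed block form of gamma(X): off-diagonal blocks tilde(X) and star(X)."""
    return np.block([[Z2, tilde(X)], [star(X), Z2]])


def D(A):
    """Dirac spinor representation, assumed to be diag(A, (A^dagger)^{-1})."""
    return np.block([[A, Z2], [Z2, np.linalg.inv(A.conj().T)]])


def coords(M):
    """Invert (6.13a): recover (x, t) from the Hermitian matrix x.sigma + t I."""
    return np.array([0.5 * np.trace(s @ M).real for s in sigma] + [0.5 * np.trace(M).real])


def rho(A):
    """Lorentz matrix of A via (6.16): column mu holds coords of A tilde(e_mu) A^dagger."""
    return np.column_stack([coords(A @ tilde(e) @ A.conj().T) for e in np.eye(4)])


g = [gamma(e) for e in np.eye(4)]
clifford_ok = all(
    np.allclose(g[m] @ g[n] + g[n] @ g[m], 2 * eta[m, n] * np.eye(4))
    for m in range(4) for n in range(4)
)
gamma5 = g[0] @ g[1] @ g[2] @ g[3]

M = np.array([[1 + 0.5j, 0.3], [-0.2j, 0.8 - 0.1j]])
A = M / np.sqrt(np.linalg.det(M))      # rescaled so that det A = 1
X = np.array([0.4, -1.0, 2.0, 0.7])
intertwines = np.allclose(gamma(rho(A) @ X), D(A) @ gamma(X) @ np.linalg.inv(D(A)))

print(clifford_ok, intertwines)
print(np.round(gamma5, 12))
```

Under these assumptions `gamma5` is diagonal, which is the sense in which this representation is chiral.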


Exercise 6.5. Verify (6.15), (6.17), (6.18), and (6.21). 


Now we’d like to construct the Dirac bilinears. Denote an element of $\mathbb{C}^4$ by

\[ \psi \equiv \begin{pmatrix} \psi_L \\ \psi_R \end{pmatrix} \]

where $\psi_L$ and $\psi_R$ are two-component spinors living in $(\tfrac{1}{2}, 0)$ and $(0, \tfrac{1}{2})$
respectively. Also define the row vector $\bar{\psi}$ by

\[ \bar{\psi} \equiv (\psi_R^* \ \ \psi_L^*) \]

where the “*” denotes complex conjugation (note that the positions of the right- and
left-handed spinors are switched here). You can easily check that if $\psi$ transforms
like a Dirac spinor, i.e. $\psi \mapsto D(A)\psi$, then $\bar{\psi}$ transforms like

\[ \bar{\psi} \mapsto \bar{\psi} \begin{pmatrix} A^{-1} & 0 \\ 0 & A^\dagger \end{pmatrix} = \bar{\psi}\, D(A)^{-1}. \tag{6.22} \]

If we consider the associated column vector $\bar{\psi}^T$, it transforms like

\[ \bar{\psi}^T \mapsto \begin{pmatrix} A^{-1T} & 0 \\ 0 & A^{\dagger T} \end{pmatrix} \bar{\psi}^T \]

which you should recognize as the representation dual to the Dirac spinor, since
the matrices of dual representations are just the inverse transpose of the original
matrices. However, we know from Problem 5-3 that $(\tfrac{1}{2}, 0)$ and $(0, \tfrac{1}{2})$ are equivalent
to their duals, so we conclude that $\bar{\psi}^T$ transforms like a Dirac spinor as well.


³ Also, note the analogy between (6.21) and (6.14); in fact, the Pauli matrices can be thought of as a
lower-dimensional analog of the gamma matrices, and it’s no coincidence that the gamma matrices
$\gamma_\mu$ in (6.19) are built out of the Pauli matrices! For more on this see Frankel [6].

With $\bar{\psi}$ in hand, we can now define the Dirac bilinears

    $\bar{\psi}\psi$                                                       scalar
    $\bar{\psi}\gamma_\mu\psi$                                             vector
    $\bar{\psi}\gamma_\mu\gamma_\nu\psi$, $\mu \ne \nu$                    antisymmetric 2-tensor
    $\bar{\psi}\gamma_\mu\gamma_\nu\gamma_\rho\psi$, $\mu \ne \nu \ne \rho$   pseudovector
    $\bar{\psi}\gamma_5\psi$                                               pseudoscalar


Each of these contains a product of two Dirac spinors, and can be seen as
components of tensors living in $\mathbb{C}^4 \otimes \mathbb{C}^4$. Note the similarities between the names
of the Dirac bilinears and the first five entries of the table from Example 5.41. This
makes it seem like the Dirac bilinears should transform like the components of
antisymmetric tensors (of ranks 0 through 4). Is this true? Well, using (6.22), we find
that the scalar transforms like

\[ \bar{\psi}\psi \mapsto \bar{\psi}\, D(A)^{-1} D(A)\, \psi = \bar{\psi}\psi \]

and so really does transform like a scalar. Similarly, using (6.20), we find that the
vector transforms like

\begin{align*}
\bar{\psi}\gamma_\mu\psi &\mapsto \bar{\psi}\, D(A)^{-1} \gamma_\mu D(A)\, \psi \\
&= (\Lambda^{-1})_\mu{}^\nu\, \bar{\psi}\gamma_\nu\psi
\end{align*}

and so really does transform like a vector. Using the anticommutation relation (6.21),
you can similarly verify that the antisymmetric 2-tensor, pseudovector,
and pseudoscalar transform like antisymmetric tensors of ranks 2, 3, and 4
respectively. If we let $\Lambda^*\mathbb{R}^4$ denote the set of all antisymmetric tensor products of $\mathbb{R}^4$, i.e.

\[ \Lambda^*\mathbb{R}^4 \equiv \bigoplus_{k=0}^{4} \Lambda^k\mathbb{R}^4, \]

and let $\Lambda^*\mathbb{R}^4_{\mathbb{C}}$ denote its complexification, then $\mathbb{C}^4 \otimes \mathbb{C}^4$ contains a subspace
equivalent to $\Lambda^*\mathbb{R}^4_{\mathbb{C}}$. However, one can check that both spaces have (complex)
dimension 16, so as $\mathfrak{sl}(2, \mathbb{C})_{\mathbb{R}}$ representations

\[ \mathbb{C}^4 \otimes \mathbb{C}^4 \simeq \Lambda^*\mathbb{R}^4_{\mathbb{C}}. \]
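In other words, the tensor product of the Dirac spinor representation with itself
is equivalent to the space of all antisymmetric tensors!

Another way to obtain this same result is to use our tensor product
decomposition (5.76). The tensor product $\mathbb{C}^4 \otimes \mathbb{C}^4$ of two Dirac spinors is given by

\begin{align*}
& \big[(\tfrac{1}{2}, 0) \oplus (0, \tfrac{1}{2})\big] \otimes \big[(\tfrac{1}{2}, 0) \oplus (0, \tfrac{1}{2})\big] \\
&\quad = \big[(\tfrac{1}{2}, 0) \otimes (\tfrac{1}{2}, 0)\big] \oplus \big[(\tfrac{1}{2}, 0) \otimes (0, \tfrac{1}{2})\big] \oplus \big[(0, \tfrac{1}{2}) \otimes (\tfrac{1}{2}, 0)\big] \oplus \big[(0, \tfrac{1}{2}) \otimes (0, \tfrac{1}{2})\big] \\
&\quad \simeq (1, 0) \oplus (0, 0) \oplus (\tfrac{1}{2}, \tfrac{1}{2}) \oplus (\tfrac{1}{2}, \tfrac{1}{2}) \oplus (0, 1) \oplus (0, 0) \\
&\quad \simeq (0, 0) \oplus (\tfrac{1}{2}, \tfrac{1}{2}) \oplus (1, 0) \oplus (0, 1) \oplus (\tfrac{1}{2}, \tfrac{1}{2}) \oplus (0, 0) \tag{6.23}
\end{align*}

where in the first equality we used the fact that the tensor product distributes over
direct sums (see Problem 5-12), in the second equality we used (5.76), and in the
third equality we just rearranged the summands. However, from Example 5.41 we
know that as an $\mathfrak{so}(3, 1)$ representation, $\Lambda^*\mathbb{R}^4_{\mathbb{C}}$ decomposes as

\[ \Lambda^*\mathbb{R}^4_{\mathbb{C}} \simeq (0, 0) \oplus (\tfrac{1}{2}, \tfrac{1}{2}) \oplus (1, 0) \oplus (0, 1) \oplus (\tfrac{1}{2}, \tfrac{1}{2}) \oplus (0, 0) \]

which is just (6.23)!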


Exercise 6.6. Verify (6.22). Also verify that the antisymmetric 2-tensor, pseudovector and 
pseudoscalar bilinears transform like the components of antisymmetric tensors of rank 2, 3, 
and 4. 


Chapter 6 Problems


Note: Problems marked with an “*” tend to be longer, and/or more difficult, and/or
more geared towards the completion of proofs and the tying up of loose ends.
Though these problems are still worthwhile, they can be skipped on a first reading.


6-1. In this problem we’ll establish a couple of the basic properties of the
orthogonal projection operator. To this end, let $\mathcal{H}$ be a Hilbert space with
inner product $(\cdot\,|\,\cdot)$ and let $W$ be a finite-dimensional subspace of $\mathcal{H}$.


(a) Show that for any $v \in \mathcal{H}$, there exists a unique vector $P(v) \in W$ such
that

\[ (P(v)|w) = (v|w) \qquad \forall\, w \in W. \]

This defines the orthogonal projection map $P : \mathcal{H} \to W$ which
projects $\mathcal{H}$ onto $W$. (Hint: there are a few ways to show that $P(v)$
exists and is unique. One route is to consider the map $L : \mathcal{H} \to \mathcal{H}^*$
given by $L(v) = (v|\cdot)$ and then play around with restrictions to $W$.)
(b) Quickly show that $P(w) = w$ for all $w \in W$, and hence that $P^2 = P$.




(c) Let $\{e_i\}$ be a (possibly infinite) orthonormal basis for $\mathcal{H}$ where the first
$k$ vectors $e_i$, $i = 1, \ldots, k$ are a basis for $W$. Show that if we expand
an arbitrary $v \in \mathcal{H}$ as $v = v^i e_i$ where the implied sum is over all $i$
(and where the sum may be infinite), then $P$ takes the simple form

\[ P(v) = P(v^i e_i) = \sum_{i=1}^{k} v^i e_i. \]

Thus $P$ can be thought of as “projecting out all the components
orthogonal to $W$.”

(d) Let $\{e_i\}_{i = 1, \ldots, k}$ be an arbitrary orthonormal basis for $W$. Show that in this
case the action of $P$ is given by

\[ P(v) = \sum_{i=1}^{k} (e_i | v)\, e_i. \]

6-2. Let $(\Pi, \mathcal{H})$ be a unitary representation of a group $G$ on a Hilbert space $\mathcal{H}$.
In this problem we’ll show that inequivalent irreducible subspaces of $\mathcal{H}$ are
orthogonal.


(a) Let $W \subset \mathcal{H}$ be a finite-dimensional irreducible subspace and $P :
\mathcal{H} \to W$ the orthogonal projection operator onto $W$. Use the defining
property of $P$, the unitarity of $\Pi$, and the invariance of $W$ to show that
$P$ is an intertwiner.

(b) Let $V \subset \mathcal{H}$ be another irreducible subspace inequivalent to $W$. Restrict
$P$ to $V$ to get $P|_V : V \to W$, which is still an intertwiner. Use
Schur’s lemma to deduce that $P|_V = 0$, and conclude that $V$ and $W$
are orthogonal.

Appendix A

Complexifications of Real Lie Algebras and the Tensor Product Decomposition
of $\mathfrak{sl}(2, \mathbb{C})_{\mathbb{R}}$ Representations


The goal of this appendix is to prove Proposition 5.9 about the tensor product
decomposition of two $\mathfrak{sl}(2, \mathbb{C})_{\mathbb{R}}$ representations. The proof is long but will introduce
some useful notions, like the direct sum and complexification of a Lie algebra.
We’ll use these notions to show that the representations of $\mathfrak{sl}(2, \mathbb{C})_{\mathbb{R}}$ are in 1-1
correspondence with certain representations of the complex Lie algebra $\mathfrak{sl}(2, \mathbb{C}) \oplus
\mathfrak{sl}(2, \mathbb{C})$. That this complex Lie algebra is a direct sum will imply certain properties
about its representations, which in turn will allow us to prove Proposition 5.9.


A.1 Direct Sums and Complexifications of Lie Algebras 


In this text we have dealt only with real Lie algebras, as that is the case of greatest 
interest for physicists. From a more mathematical point of view, however, it actually 
simplifies matters to focus on the complex case, and we will need that approach 
to prove Proposition 5.9. With that in mind, we make the following definition (in 
total analogy to the real case): A complex Lie algebra is a complex vector space g 
equipped with a complex-linear Lie bracket [-,-] : g x g — g which satisfies the 
usual axioms of antisymmetry 


[X,Y] = -[Y, X] Y X,Y €g, 
and the Jacobi identity 
[X,Y], Z] + [[Y, Z], X] + [[Z, X], Y] =0 Y X,Y,Z €g. 
Examples of complex Lie algebras are M, (C) = gl(u,C), the set of all complex 


n Xn matrices, and sl(n, C), the set of all complex, traceless n x n matrices. In both 
cases the bracket is just given by the commutator of matrices. 
For our application we'll be interested in turning real Lie algebras into complex
Lie algebras. We already know how to complexify vector spaces, so to turn a real Lie
algebra into a complex one we just have to extend the Lie bracket to the complexified
vector space. This is done in the obvious way: given a real Lie algebra $\mathfrak{g}$ with bracket
$[\cdot,\cdot]$, we define its complexification to be the complexified vector space
$\mathfrak{g}_{\mathbb{C}} = \mathbb{C} \otimes \mathfrak{g}$ with Lie bracket $[\cdot,\cdot]_{\mathbb{C}}$ defined by

$$[1 \otimes X_1 + i \otimes X_2,\; 1 \otimes Y_1 + i \otimes Y_2]_{\mathbb{C}} = 1 \otimes [X_1, Y_1] - 1 \otimes [X_2, Y_2] + i \otimes [X_1, Y_2] + i \otimes [X_2, Y_1]$$

where $X_i, Y_i \in \mathfrak{g}$, $i = 1, 2$. If we abbreviate $i \otimes X$ as $iX$ and $1 \otimes X$ as $X$, this
tidies up and becomes

$$[X_1 + iX_2,\; Y_1 + iY_2]_{\mathbb{C}} = [X_1, Y_1] - [X_2, Y_2] + i\,([X_1, Y_2] + [X_2, Y_1]).$$

This formula defines $[\cdot,\cdot]_{\mathbb{C}}$ in terms of $[\cdot,\cdot]$, and is also exactly what you'd get by
naively using complex linearity to expand the left-hand side.
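
As a quick sanity check on the bracket formula, the following Python sketch stores an element $X_1 + iX_2$ of the complexification of su(2) as a formal pair of matrices and evaluates the definition directly. The helper names (`bracket_C`, `times_i`) and the basis $S_i = -\frac{i}{2}\sigma_i$ for su(2) are assumptions made for illustration only.

```python
# An element 1⊗X1 + i⊗X2 of g_C is stored as the pair (X1, X2), X1, X2 in g.
ZERO = [[0, 0], [0, 0]]

def mul(A, B):
    """Matrix product of square matrices stored as nested lists."""
    n = len(A)
    return [[sum(A[r][k] * B[k][c] for k in range(n)) for c in range(n)]
            for r in range(n)]

def add(A, B, s=1):
    """Return A + s*B."""
    return [[a + s * b for a, b in zip(ra, rb)] for ra, rb in zip(A, B)]

def comm(A, B):
    """Matrix commutator [A, B]."""
    return add(mul(A, B), mul(B, A), -1)

def bracket_C(X, Y):
    """Complexified bracket, written out term by term from the definition."""
    (X1, X2), (Y1, Y2) = X, Y
    real_part = add(comm(X1, Y1), comm(X2, Y2), -1)
    imag_part = add(comm(X1, Y2), comm(X2, Y1))
    return (real_part, imag_part)

def times_i(X):
    """Formal multiplication by i: i(X1 + iX2) = -X2 + iX1."""
    X1, X2 = X
    return (add(ZERO, X2, -1), X1)

def close(X, Y, tol=1e-12):
    """Compare two pairs of matrices entry by entry."""
    return all(abs(a - b) < tol
               for P, Q in zip(X, Y) for rp, rq in zip(P, Q)
               for a, b in zip(rp, rq))

# Assumed su(2) basis S_i = -(i/2) sigma_i, with [S_x, S_y] = S_z.
S = [
    [[0, -0.5j], [-0.5j, 0]],
    [[0, -0.5], [0.5, 0]],
    [[-0.5j, 0], [0, 0.5j]],
]

X = (S[0], S[1])  # S_x + i S_y
Y = (S[2], S[0])  # S_z + i S_x

# The bracket commutes with formal multiplication by i in the first slot ...
complex_linear = close(bracket_C(times_i(X), Y), times_i(bracket_C(X, Y)))
# ... and reduces to the original bracket on elements of the form 1⊗X.
real_case = close(bracket_C((S[0], ZERO), (S[1], ZERO)), (S[2], ZERO))
print(complex_linear, real_case)
```

Keeping the two slots separate, rather than forming the matrix $X_1 + iX_2$, mirrors the fact that $i \otimes X$ is a formal object and not a matrix product.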
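What does this process yield in familiar cases? For $\mathfrak{su}(2)$ we define a map

$$\phi : \mathfrak{su}(2)_{\mathbb{C}} \to \mathfrak{sl}(2,\mathbb{C}), \qquad 1 \otimes X_1 + i \otimes X_2 \mapsto X_1 + \begin{pmatrix} i & 0 \\ 0 & i \end{pmatrix} X_2, \qquad X_1, X_2 \in \mathfrak{su}(2). \tag{A.1}$$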


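You will show below that this is a Lie algebra isomorphism, and hence $\mathfrak{su}(2)$
complexifies to become $\mathfrak{sl}(2,\mathbb{C})$. You will also use similar maps to show that
$\mathfrak{u}(n)_{\mathbb{C}} \simeq \mathfrak{gl}(n,\mathbb{R})_{\mathbb{C}} \simeq \mathfrak{gl}(n,\mathbb{C})$. If we complexify the real Lie algebra $\mathfrak{sl}(2,\mathbb{C})_{\mathbb{R}}$, we
also get something nice. The complexified Lie algebra $(\mathfrak{sl}(2,\mathbb{C})_{\mathbb{R}})_{\mathbb{C}}$ has complex
basis

$$\begin{aligned}
M_i &\equiv \tfrac{1}{2}\,(1 \otimes S_i - i \otimes K_i), \quad i = 1, 2, 3 \\
N_i &\equiv \tfrac{1}{2}\,(1 \otimes S_i + i \otimes K_i), \quad i = 1, 2, 3
\end{aligned} \tag{A.2}$$

which we can again abbreviate¹ as

$$\begin{aligned}
M_i &\equiv \tfrac{1}{2}\,(S_i - iK_i) \\
N_i &\equiv \tfrac{1}{2}\,(S_i + iK_i).
\end{aligned}$$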
¹Careful here! When we write $iK_i$, this is not to be interpreted as $i$ times the matrix $K_i$, as this
would make the $M_i$ identically zero (check!); it is merely shorthand for $i \otimes K_i$.
These expressions are very similar to ones found in our discussion of $\mathfrak{sl}(2,\mathbb{C})_{\mathbb{R}}$
representations, and using the bracket on $(\mathfrak{sl}(2,\mathbb{C})_{\mathbb{R}})_{\mathbb{C}}$ one can verify that the
analogs of Eq. (5.72) hold, i.e. that

$$\begin{aligned}
[M_i, N_j]_{\mathbb{C}} &= 0 \\
[M_i, M_j]_{\mathbb{C}} &= \sum_{k=1}^{3} \epsilon_{ijk} M_k \\
[N_i, N_j]_{\mathbb{C}} &= \sum_{k=1}^{3} \epsilon_{ijk} N_k.
\end{aligned} \tag{A.3}$$
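
The relations (A.3) can be spot-checked numerically. The sketch below again stores $X_1 + iX_2$ as a formal pair, so that $i \otimes K_i$ is never confused with a matrix product. It assumes the matrix conventions $S_i = -\frac{i}{2}\sigma_i$ and $K_i = iS_i$ for the basis of $\mathfrak{sl}(2,\mathbb{C})_{\mathbb{R}}$; these conventions, and all helper names, are assumptions of this example and should be adjusted to match the definitions in use.

```python
ZERO = [[0, 0], [0, 0]]

def mul(A, B):
    n = len(A)
    return [[sum(A[r][k] * B[k][c] for k in range(n)) for c in range(n)]
            for r in range(n)]

def add(A, B, s=1):
    """Return A + s*B."""
    return [[a + s * b for a, b in zip(ra, rb)] for ra, rb in zip(A, B)]

def comm(A, B):
    return add(mul(A, B), mul(B, A), -1)

def bracket_C(X, Y):
    """Complexified bracket on formal pairs (X1, X2) ~ 1⊗X1 + i⊗X2."""
    (X1, X2), (Y1, Y2) = X, Y
    return (add(comm(X1, Y1), comm(X2, Y2), -1),
            add(comm(X1, Y2), comm(X2, Y1)))

def close(X, Y, tol=1e-12):
    return all(abs(a - b) < tol
               for P, Q in zip(X, Y) for rp, rq in zip(P, Q)
               for a, b in zip(rp, rq))

def eps(i, j, k):
    """Levi-Civita symbol for indices in {0, 1, 2}."""
    return (i - j) * (j - k) * (k - i) // 2

def scaled(A, s):
    return [[s * x for x in row] for row in A]

# Assumed basis of sl(2,C)_R: S_i = -(i/2) sigma_i and K_i = i S_i (as matrices).
S = [
    [[0, -0.5j], [-0.5j, 0]],
    [[0, -0.5], [0.5, 0]],
    [[-0.5j, 0], [0, 0.5j]],
]
K = [scaled(Si, 1j) for Si in S]

# M_i = (1/2)(1⊗S_i - i⊗K_i),  N_i = (1/2)(1⊗S_i + i⊗K_i), as in (A.2).
M = [(scaled(S[i], 0.5), scaled(K[i], -0.5)) for i in range(3)]
N = [(scaled(S[i], 0.5), scaled(K[i], 0.5)) for i in range(3)]

def structure_sum(i, j, basis):
    """Return sum_k eps_ijk * basis[k] as a formal pair."""
    out = (ZERO, ZERO)
    for k in range(3):
        out = tuple(add(o, b, eps(i, j, k)) for o, b in zip(out, basis[k]))
    return out

idx = [(i, j) for i in range(3) for j in range(3)]
mixed_vanish = all(close(bracket_C(M[i], N[j]), (ZERO, ZERO)) for i, j in idx)
M_closes = all(close(bracket_C(M[i], M[j]), structure_sum(i, j, M)) for i, j in idx)
N_closes = all(close(bracket_C(N[i], N[j]), structure_sum(i, j, N)) for i, j in idx)
print(mixed_vanish, M_closes, N_closes)
```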


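Notice that both $\mathrm{Span}\{M_i\}$ and $\mathrm{Span}\{N_i\}$ (over the complex numbers) are Lie
subalgebras of $(\mathfrak{sl}(2,\mathbb{C})_{\mathbb{R}})_{\mathbb{C}}$ and that both are isomorphic to $\mathfrak{sl}(2,\mathbb{C})$, since

$$\mathrm{Span}\{M_i\} \simeq \mathrm{Span}\{N_i\} \simeq \mathfrak{su}(2)_{\mathbb{C}} \simeq \mathfrak{sl}(2,\mathbb{C}). \tag{A.4}$$

Furthermore, the bracket between an element of $\mathrm{Span}\{M_i\}$ and an element of
$\mathrm{Span}\{N_i\}$ is 0, by (A.3). Also, as a (complex) vector space $(\mathfrak{sl}(2,\mathbb{C})_{\mathbb{R}})_{\mathbb{C}}$ is the direct
sum $\mathrm{Span}\{M_i\} \oplus \mathrm{Span}\{N_i\}$. When a Lie algebra $\mathfrak{g}$ can be written as a direct sum of
subspaces $W_1$ and $W_2$, where the $W_i$ are each subalgebras and $[w_1, w_2] = 0$ for all
$w_1 \in W_1$, $w_2 \in W_2$, we say that the original Lie algebra $\mathfrak{g}$ is a Lie algebra direct
sum of $W_1$ and $W_2$, and we write $\mathfrak{g} = W_1 \oplus W_2$.² Thus, we have the Lie algebra
direct sum decomposition

$$(\mathfrak{sl}(2,\mathbb{C})_{\mathbb{R}})_{\mathbb{C}} \simeq \mathfrak{sl}(2,\mathbb{C}) \oplus \mathfrak{sl}(2,\mathbb{C}). \tag{A.5}$$

This decomposition will be crucial in our proof of Proposition 5.9.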


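Exercise A.1. Prove that (A.1) is a Lie algebra isomorphism. Remember, this consists of
showing that $\phi$ is a vector space isomorphism, and then showing that $\phi$ preserves brackets.
Then find similar Lie algebra isomorphisms to prove that

$$\begin{aligned}
\mathfrak{u}(n)_{\mathbb{C}} &\simeq \mathfrak{gl}(n,\mathbb{C}) \\
\mathfrak{gl}(n,\mathbb{R})_{\mathbb{C}} &\simeq \mathfrak{gl}(n,\mathbb{C}).
\end{aligned}$$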
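
One ingredient of the $\mathfrak{u}(n)$ case can be illustrated numerically: every complex matrix $A$ splits as $A = X_1 + iX_2$ with $X_1$, $X_2$ anti-Hermitian, which is the surjectivity half of a map analogous to (A.1). The sketch below assumes the convention that $\mathfrak{u}(n)$ consists of anti-Hermitian matrices; the function names are hypothetical, and the check is no substitute for the proof the exercise asks for.

```python
def dagger(A):
    """Conjugate transpose of a square matrix stored as nested lists."""
    n = len(A)
    return [[complex(A[c][r]).conjugate() for c in range(n)] for r in range(n)]

def lincomb(a, A, b, B):
    """Return a*A + b*B."""
    return [[a * x + b * y for x, y in zip(ra, rb)] for ra, rb in zip(A, B)]

def split(A):
    """Return anti-Hermitian X1, X2 with A = X1 + i*X2."""
    Ad = dagger(A)
    X1 = lincomb(0.5, A, -0.5, Ad)      # (A - A†)/2
    X2 = lincomb(-0.5j, A, -0.5j, Ad)   # (A + A†)/(2i)
    return X1, X2

def is_anti_hermitian(X, tol=1e-12):
    Xd = dagger(X)
    return all(abs(x + y) < tol for rx, ry in zip(X, Xd) for x, y in zip(rx, ry))

# An arbitrary 2x2 complex matrix, i.e. an element of gl(2, C).
A = [[1 + 2j, 3], [-1j, 4 - 1j]]
X1, X2 = split(A)
rebuilt = lincomb(1, X1, 1j, X2)

recovered = all(abs(x - y) < 1e-12
                for ra, rb in zip(A, rebuilt) for x, y in zip(ra, rb))
print(is_anti_hermitian(X1), is_anti_hermitian(X2), recovered)
```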


²Notice that this notation is ambiguous, since it could mean either that $\mathfrak{g}$ is the direct sum of $W_1$ and
$W_2$ as vector spaces (which would then tell you nothing about how the direct sum decomposition
interacts with the Lie bracket), or it could mean Lie algebra direct sum. We'll be explicit if there is
any possibility of confusion.
A.2 Representations of Complexified Lie Algebras
and the Tensor Product Decomposition of $\mathfrak{sl}(2,\mathbb{C})_{\mathbb{R}}$
Representations


In order for (A.5) to be of any use, we must know how the representations of a
real Lie algebra relate to the representations of its complexification. First off, we
should clarify that when we speak of a representation of a complex Lie algebra $\mathfrak{g}$ we
are ignoring the complex vector space structure of $\mathfrak{g}$; in particular, the Lie algebra
homomorphism $\pi : \mathfrak{g} \to \mathfrak{gl}(V)$ is only required to be real-linear, in the sense
that $\pi(cX) = c\,\pi(X)$ for all real numbers $c$. If $\mathfrak{g}$ is a complex Lie algebra, the
representation space $V$ is complex, and $\pi(cX) = c\,\pi(X)$ for all complex numbers
$c$, then we say that $\pi$ is complex-linear. Not all complex representations of complex
Lie algebras are complex-linear. For instance, all of the $\mathfrak{sl}(2,\mathbb{C})_{\mathbb{R}}$ representations
described in Sect. 5.11 can be thought of as representations of the complex Lie
algebra $\mathfrak{sl}(2,\mathbb{C})$, but only those of the form $(j, 0)$ are complex-linear, as you will
show below.
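
To make the distinction concrete, here is a small Python sketch of a map that preserves brackets and is real-linear but not complex-linear: entrywise complex conjugation of $2 \times 2$ traceless matrices. This particular map is chosen here purely as an illustration and is not singled out in the text; the helper names are hypothetical.

```python
def mul(A, B):
    n = len(A)
    return [[sum(A[r][k] * B[k][c] for k in range(n)) for c in range(n)]
            for r in range(n)]

def comm(A, B):
    """Matrix commutator [A, B]."""
    AB, BA = mul(A, B), mul(B, A)
    return [[x - y for x, y in zip(ra, rb)] for ra, rb in zip(AB, BA)]

def scale(c, A):
    return [[c * x for x in row] for row in A]

def conj_rep(X):
    """Entrywise complex conjugation, viewed as a map sl(2,C) -> gl(2,C)."""
    return [[complex(x).conjugate() for x in row] for row in X]

# Two traceless matrices in sl(2, C).
E = [[0, 1], [0, 0]]
H = [[1, 0], [0, -1]]

# Brackets are preserved, so this is a Lie algebra homomorphism ...
preserves_bracket = conj_rep(comm(H, E)) == comm(conj_rep(H), conj_rep(E))
# ... it commutes with real scalars ...
real_linear = conj_rep(scale(2.5, E)) == scale(2.5, conj_rep(E))
# ... but not with multiplication by i, since conj(iX) = -i conj(X).
complex_linear = conj_rep(scale(1j, E)) == scale(1j, conj_rep(E))
print(preserves_bracket, real_linear, complex_linear)
```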


Exercise A.2. Consider all the representations of $\mathfrak{sl}(2,\mathbb{C})_{\mathbb{R}}$ as representations of $\mathfrak{sl}(2,\mathbb{C})$
as well. Show directly that the fundamental representation $(\tfrac{1}{2}, 0)$ of $\mathfrak{sl}(2,\mathbb{C})$ is
complex-linear, and use this to prove that all $\mathfrak{sl}(2,\mathbb{C})$ representations of the form $(j, 0)$ are
complex-linear. Furthermore, by considering the operators $N_i$ defined in (5.71), show that these are
the only complex-linear representations of $\mathfrak{sl}(2,\mathbb{C})$.


Now let $\mathfrak{g}$ be a real Lie algebra and let $(\pi, V)$ be a complex representation
of $\mathfrak{g}$. Then we can extend $(\pi, V)$ to a complex-linear representation of the
complexification $\mathfrak{g}_{\mathbb{C}}$ in the obvious way, by setting

$$\pi(X_1 + iX_2) \equiv \pi(X_1) + i\,\pi(X_2), \qquad X_1, X_2 \in \mathfrak{g}.$$

(Notice that this representation is complex-linear by definition, and that the operator
$i\,\pi(X_2)$ is only well defined because $V$ is a complex vector space.) Furthermore,
this extension operation is reversible: that is, given a complex-linear representation
$(\pi, V)$ of $\mathfrak{g}_{\mathbb{C}}$, we can get a representation of $\mathfrak{g}$ by simply restricting $\pi$ to the subspace
$\{1 \otimes X + i \otimes 0 \mid X \in \mathfrak{g}\} \simeq \mathfrak{g}$, and this restriction reverses the extension just defined.
Moreover, you will show below that $(\pi, V)$ is an irrep of $\mathfrak{g}_{\mathbb{C}}$ if and only if it
corresponds to an irrep of $\mathfrak{g}$. We thus have


Proposition A.1. The irreducible complex representations of a real Lie algebra $\mathfrak{g}$
are in one-to-one correspondence with the irreducible complex-linear representations
of its complexification $\mathfrak{g}_{\mathbb{C}}$.


This means that we can identify the irreducible complex representations of $\mathfrak{g}$ with
the irreducible complex-linear representations of $\mathfrak{g}_{\mathbb{C}}$, and we will freely make this
identification from now on. Note the contrast between what we're doing here and
what we did in Sect. 5.10; there, we complexified real representations to get complex
representations which we could then classify; here, the representation space is fixed
(and is always complex!) and we are complexifying the Lie algebra itself, to get a
representation of a complex Lie algebra on the same representation space we started
with.


Exercise A.3. Let $(\pi, V)$ be a complex representation of a real Lie algebra $\mathfrak{g}$, and extend it
to a complex-linear representation of $\mathfrak{g}_{\mathbb{C}}$. Show that $(\pi, V)$ is irreducible as a representation
of $\mathfrak{g}$ if and only if it is irreducible as a representation of $\mathfrak{g}_{\mathbb{C}}$.


Example A.1. The complex-linear irreducible representations of $\mathfrak{sl}(2,\mathbb{C})$


As a first application of Proposition A.1, consider the complex Lie algebra $\mathfrak{sl}(2,\mathbb{C})$.
Since $\mathfrak{sl}(2,\mathbb{C}) \simeq \mathfrak{su}(2)_{\mathbb{C}}$, we conclude that its complex-linear irreps are just the irreps
$(\pi_j, V_j)$ of $\mathfrak{su}(2)$! In fact, you can easily show directly that the complex-linear
$\mathfrak{sl}(2,\mathbb{C})$ representation corresponding to $(\pi_j, V_j)$ is just $(j, 0)$. □


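As a second application, note that by (A.5) the complex-linear irreps of
$\mathfrak{sl}(2,\mathbb{C}) \oplus \mathfrak{sl}(2,\mathbb{C})$ are just the representations $(j_1, j_2)$ coming from $\mathfrak{sl}(2,\mathbb{C})_{\mathbb{R}}$.
Since $\mathfrak{sl}(2,\mathbb{C}) \oplus \mathfrak{sl}(2,\mathbb{C})$ is a direct sum, however, there is another way to construct
complex-linear irreps. Take two complex-linear irreps of $\mathfrak{sl}(2,\mathbb{C})$, say $(\pi_{j_1}, V_{j_1})$
and $(\pi_{j_2}, V_{j_2})$. We can take a modified tensor product of these representations such
that the resulting representation is not of $\mathfrak{sl}(2,\mathbb{C})$ but rather of $\mathfrak{sl}(2,\mathbb{C}) \oplus \mathfrak{sl}(2,\mathbb{C})$.
We denote this representation by $(\pi_{j_1} \bar{\otimes}\, \pi_{j_2}, V_{j_1} \otimes V_{j_2})$ and define it by

$$(\pi_{j_1} \bar{\otimes}\, \pi_{j_2})(X_1, X_2) \equiv \pi_{j_1}(X_1) \otimes I + I \otimes \pi_{j_2}(X_2) \in \mathcal{L}(V_{j_1} \otimes V_{j_2}), \qquad X_1, X_2 \in \mathfrak{sl}(2,\mathbb{C}). \tag{A.6}$$

Note that we have written the tensor product in $\pi_{j_1} \bar{\otimes}\, \pi_{j_2}$ as "$\bar{\otimes}$" rather than "$\otimes$";
this is to distinguish this tensor product of representations from the tensor product
of representations defined in Sect. 5.4. In the earlier definition, we took a tensor
product of two $\mathfrak{g}$ representations and produced a third $\mathfrak{g}$ representation given by
$(\pi_1 \otimes \pi_2)(X) = \pi_1(X) \otimes I + I \otimes \pi_2(X)$, where the same element $X \in \mathfrak{g}$ gets
fed into both $\pi_1$ and $\pi_2$; here, we take a tensor product of two $\mathfrak{g}$ representations and
produce a representation of $\mathfrak{g} \oplus \mathfrak{g}$, where two different elements $X_1, X_2 \in \mathfrak{g}$ get fed
into $\pi_1$ and $\pi_2$.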
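
The definition (A.6) can be tried out in the smallest case, where both factors are the fundamental representation (the identity map on $2 \times 2$ traceless matrices), so that the operators act on a four-dimensional space via Kronecker products. The sketch below checks on sample elements that the barred product respects the componentwise bracket of the direct sum; the function names `kron` and `barred` are hypothetical helpers introduced only for this example.

```python
def mul(A, B):
    n = len(A)
    return [[sum(A[r][k] * B[k][c] for k in range(n)) for c in range(n)]
            for r in range(n)]

def add(A, B, s=1):
    """Return A + s*B."""
    return [[a + s * b for a, b in zip(ra, rb)] for ra, rb in zip(A, B)]

def comm(A, B):
    return add(mul(A, B), mul(B, A), -1)

def kron(A, B):
    """Kronecker product: rows indexed by (i, k), columns by (j, l)."""
    n, m = len(A), len(B)
    return [[A[i][j] * B[k][l] for j in range(n) for l in range(m)]
            for i in range(n) for k in range(m)]

I2 = [[1, 0], [0, 1]]

def barred(X1, X2):
    """(pi barred-tensor pi)(X1, X2) = X1 ⊗ I + I ⊗ X2, with pi the fundamental rep."""
    return add(kron(X1, I2), kron(I2, X2))

# Sample traceless matrices.
E = [[0, 1], [0, 0]]
F = [[0, 0], [1, 0]]
H = [[1, 0], [0, -1]]

X = (E, H)   # an element (X1, X2) of sl(2,C) ⊕ sl(2,C)
Y = (F, E)

# The bracket on the direct sum is componentwise: [(X1,X2),(Y1,Y2)] = ([X1,Y1],[X2,Y2]).
lhs = barred(comm(X[0], Y[0]), comm(X[1], Y[1]))
rhs = comm(barred(*X), barred(*Y))
is_homomorphism = lhs == rhs
print(is_homomorphism)

# Feeding the same element into both slots recovers the ordinary tensor product
# representation of Sect. 5.4 evaluated at that element.
ordinary_at_H = barred(H, H)
```

The cross terms $X_1 \otimes I$ and $I \otimes Y_2$ commute, which is why the two slots can carry independent Lie algebra elements.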

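Now, one might wonder if the representation $(\pi_{j_1} \bar{\otimes}\, \pi_{j_2}, V_{j_1} \otimes V_{j_2})$ of
$\mathfrak{sl}(2,\mathbb{C}) \oplus \mathfrak{sl}(2,\mathbb{C})$ defined above is equivalent to $(j_1, j_2)$; this is in fact the case!
To prove this, recall the following notation: the representation space $V_{j_1} \otimes V_{j_2}$ is
spanned by vectors of the form $v_{k_1} \otimes v_{k_2}$, $k_i = 0, \ldots, 2j_i$, $i = 1, 2$, where the
$v_{k_i}$ are characterized by (5.65). Similarly, the representation space $V_{(j_1,j_2)}$ of $(j_1, j_2)$
is spanned by vectors of the form $v_{k_1,k_2}$, $k_i = 0, \ldots, 2j_i$, $i = 1, 2$, where these
vectors are characterized by (5.74). We can thus define the obvious intertwiner

$$\phi : V_{j_1} \otimes V_{j_2} \to V_{(j_1,j_2)}, \qquad v_{k_1} \otimes v_{k_2} \mapsto v_{k_1,k_2}.$$

Of course, we must check that this map actually is an intertwiner, i.e. that

$$\phi \circ \left[(\pi_{j_1} \bar{\otimes}\, \pi_{j_2})(X_1, X_2)\right] = \left[\pi_{(j_1,j_2)}(X_1, X_2)\right] \circ \phi \qquad \forall\, X_1, X_2 \in \mathfrak{sl}(2,\mathbb{C}). \tag{A.7}$$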
Since $\mathfrak{sl}(2,\mathbb{C}) \oplus \mathfrak{sl}(2,\mathbb{C})$ is of (complex) dimension six, it suffices to check this for
the six basis vectors

$$\begin{aligned}
&(iM_z, 0), \\
&(0, iN_z), \\
&(M_+, 0) \equiv (iM_x - M_y, 0), \\
&(M_-, 0) \equiv (iM_x + M_y, 0), \\
&(0, N_+) \equiv (0, iN_x - N_y), \\
&(0, N_-) \equiv (0, iN_x + N_y),
\end{aligned}$$

where the $M_i$ and $N_i$ were defined in (A.2). We now check (A.7) for $(iM_z, 0)$
on an arbitrary basis vector $v_{k_1} \otimes v_{k_2}$, and leave the verification for the other five
$\mathfrak{sl}(2,\mathbb{C}) \oplus \mathfrak{sl}(2,\mathbb{C})$ basis vectors to you. The left-hand side of (A.7) gives (careful with
all the parentheses!)
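$$\begin{aligned}
\phi\big((\pi_{j_1} \bar{\otimes}\, \pi_{j_2})(iM_z, 0)\,(v_{k_1} \otimes v_{k_2})\big) &= \phi\big((J_z \otimes I)(v_{k_1} \otimes v_{k_2})\big) \\
&= \phi\big((j_1 - k_1)\, v_{k_1} \otimes v_{k_2}\big) \\
&= (j_1 - k_1)\, v_{k_1,k_2}
\end{aligned}$$

where in the first equality we identified $M_z$ with $S_z \in \mathfrak{su}(2)_{\mathbb{C}}$ as per (A.4).
Meanwhile, viewing $M_z$ as an element of $(\mathfrak{sl}(2,\mathbb{C})_{\mathbb{R}})_{\mathbb{C}}$, the right-hand side of (A.7)
is

$$\begin{aligned}
\pi_{(j_1,j_2)}(iM_z, 0)\,\big(\phi(v_{k_1} \otimes v_{k_2})\big) &= \big(\pi_{(j_1,j_2)}(iM_z, 0)\big)(v_{k_1,k_2}) \\
&= iM_z\, v_{k_1,k_2} \\
&= (j_1 - k_1)\, v_{k_1,k_2}
\end{aligned}$$

and so the two sides agree. The verification for the other five $\mathfrak{sl}(2,\mathbb{C}) \oplus \mathfrak{sl}(2,\mathbb{C})$
basis vectors proceeds similarly. This proves that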


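Proposition A.2. Let $(j_1, j_2)$ denote both the usual $\mathfrak{sl}(2,\mathbb{C})_{\mathbb{R}}$ irrep and its
extension to a complex-linear irrep of $(\mathfrak{sl}(2,\mathbb{C})_{\mathbb{R}})_{\mathbb{C}} \simeq \mathfrak{sl}(2,\mathbb{C}) \oplus \mathfrak{sl}(2,\mathbb{C})$. Let
$(\pi_{j_1} \bar{\otimes}\, \pi_{j_2}, V_{j_1} \otimes V_{j_2})$ be the representation of $\mathfrak{sl}(2,\mathbb{C}) \oplus \mathfrak{sl}(2,\mathbb{C})$ defined in (A.6).
Then

$$(j_1, j_2) \simeq (\pi_{j_1} \bar{\otimes}\, \pi_{j_2}, V_{j_1} \otimes V_{j_2}). \tag{A.8}$$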


Exercise A.4. Verify (A.7) for the other five $\mathfrak{sl}(2,\mathbb{C}) \oplus \mathfrak{sl}(2,\mathbb{C})$ basis vectors, and thus
complete the proof of Proposition A.2.
We will now use Proposition A.2 to compute tensor products of the $(\mathfrak{sl}(2,\mathbb{C})_{\mathbb{R}})_{\mathbb{C}}$
irreps $(j_1, j_2)$, which by Proposition A.1 will give us the tensor products of the
$(j_1, j_2)$ as $\mathfrak{sl}(2,\mathbb{C})_{\mathbb{R}}$ irreps, which was what we wanted! In the following computation
we will use the fact that

$$(\pi_{j_1} \bar{\otimes}\, \pi_{j_2}) \otimes (\pi_{k_1} \bar{\otimes}\, \pi_{k_2}) \simeq (\pi_{j_1} \otimes \pi_{k_1}) \,\bar{\otimes}\, (\pi_{j_2} \otimes \pi_{k_2}), \tag{A.9}$$

which you will prove below (note which tensor product symbols are "barred"
and which are not). With this in hand, and using the $\mathfrak{su}(2)_{\mathbb{C}} \simeq \mathfrak{sl}(2,\mathbb{C})$ tensor
product decomposition (5.56), we have (omitting the representation spaces in the
computation)


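$$\begin{aligned}
(j_1, j_2) \otimes (k_1, k_2) &\simeq (\pi_{j_1} \bar{\otimes}\, \pi_{j_2}) \otimes (\pi_{k_1} \bar{\otimes}\, \pi_{k_2}) \\
&\simeq (\pi_{j_1} \otimes \pi_{k_1}) \,\bar{\otimes}\, (\pi_{j_2} \otimes \pi_{k_2}) \\
&\simeq \Bigg( \bigoplus_{l_1 = |j_1 - k_1|}^{j_1 + k_1} \pi_{l_1} \Bigg) \bar{\otimes} \Bigg( \bigoplus_{l_2 = |j_2 - k_2|}^{j_2 + k_2} \pi_{l_2} \Bigg) \\
&\simeq \bigoplus_{(l_1, l_2)} \pi_{l_1} \bar{\otimes}\, \pi_{l_2} \quad \text{where } |j_i - k_i| \le l_i \le j_i + k_i,\ i = 1, 2 \\
&\simeq \bigoplus_{(l_1, l_2)} (l_1, l_2) \quad \text{where } |j_i - k_i| \le l_i \le j_i + k_i,\ i = 1, 2.
\end{aligned}$$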


This gives the tensor product decomposition of complex-linear $(\mathfrak{sl}(2,\mathbb{C})_{\mathbb{R}})_{\mathbb{C}}$ irreps.
However, these irreps are just the extensions of the irreps $(j_1, j_2)$ of $\mathfrak{sl}(2,\mathbb{C})_{\mathbb{R}}$, and
you can check that the process of extending an irrep to the complexification of a Lie
algebra commutes with taking tensor products, so that the extension of a product is
the product of the extensions. From this we conclude that


Proposition A.3. The decomposition into irreps of the tensor product of two
$\mathfrak{sl}(2,\mathbb{C})_{\mathbb{R}}$ irreps $(j_1, j_2)$ and $(k_1, k_2)$ is given by

$$(j_1, j_2) \otimes (k_1, k_2) = \bigoplus (l_1, l_2) \quad \text{where } |j_1 - k_1| \le l_1 \le j_1 + k_1, \quad |j_2 - k_2| \le l_2 \le j_2 + k_2$$

which is just Proposition 5.9.
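
The decomposition rule is easy to tabulate. The following Python sketch lists the irreps $(l_1, l_2)$ appearing in a product and checks that the dimensions add up, using the fact that $(l_1, l_2)$ has dimension $(2l_1 + 1)(2l_2 + 1)$ because its basis vectors $v_{k_1,k_2}$ are labeled by $k_i = 0, \ldots, 2l_i$. The function names are hypothetical, and spins are stored as exact fractions to avoid rounding half-integers.

```python
from fractions import Fraction

def spins(j, k):
    """Values l = |j-k|, |j-k|+1, ..., j+k appearing in the su(2) rule."""
    j, k = Fraction(j), Fraction(k)
    l, out = abs(j - k), []
    while l <= j + k:
        out.append(l)
        l += 1
    return out

def decompose(rep_j, rep_k):
    """Irreps (l1, l2) in (j1, j2) ⊗ (k1, k2), following Proposition A.3."""
    (j1, j2), (k1, k2) = rep_j, rep_k
    return [(l1, l2) for l1 in spins(j1, k1) for l2 in spins(j2, k2)]

def dim(rep):
    """Dimension (2*l1 + 1)(2*l2 + 1) of the irrep (l1, l2)."""
    l1, l2 = rep
    return int((2 * Fraction(l1) + 1) * (2 * Fraction(l2) + 1))

half = Fraction(1, 2)
vec = (half, half)

parts = decompose(vec, vec)
print([(str(l1), str(l2)) for l1, l2 in parts])
# [('0', '0'), ('0', '1'), ('1', '0'), ('1', '1')]

# Dimension check: 4 * 4 = 1 + 3 + 3 + 9.
print(dim(vec) * dim(vec), sum(dim(p) for p in parts))
```

As a usage note, `decompose((half, 0), (0, half))` returns the single irrep $(\tfrac{1}{2}, \tfrac{1}{2})$, whose dimension 4 matches $2 \times 2$.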


Exercise A.5. Prove (A.9) by referring to the definitions of both kinds of tensor product
representations and by evaluating both sides of the equation on an arbitrary vector
$(X_1, X_2) \in \mathfrak{sl}(2,\mathbb{C}) \oplus \mathfrak{sl}(2,\mathbb{C})$.
Orientation, 95, 105
Orthogonal 
complement, 267 
matrices, 63, 102 
projection operator, 277, 286 
set, 35 
Orthogonal group, 119, 125-127 
proof that it’s a matrix Lie group, 146 
Orthonormal basis, 34 
for Hilbert space, 37


P 
Parity, 132, 210, 242 
and Dirac spinors, 237 
and Z2, 142 
Passive transformation, 66, 103 
Pauli matrices, 14 
Permutation(s). See also Permutation group 
even, odd, 89, 144 
Permutation group, 123, 197 
relation to Z2, 143 
Permutation operator, 90 
Phase space, 167 
Pin groups, 263 
Poisson bracket, 167 
formulation of mechanics, 208 
Polar decomposition theorem, 182 
Polynomials, 15 
harmonic (see Harmonic polynomials) 
Hermite, 39 
Laguerre, 50 
Legendre, 39 
real, 38, 50 
Positive-definite, 34 
Positive matrix, 182 
Principal axes, 62 
Principal minors, 183 
Principal moments of inertia, 62 
Product 
of matrices, 28 
state, 85 
Pseudoscalar, 218, 219 
Pseudovector, 96, 196, 218, 219, 234 


Q 
Quark, antiquark, 225 
R
Rank (of map), 138 
Rank (of tensor), 3, 54 
Rank-nullity theorem, 138 
Rapidity, 130 
Real-linear, 292 
Real numbers (as additive group), 117 
Real vector space, 13 
Representation, 167, 187, 188, 192 
adjoint, 195 
adjoint of C(P), 207 
alternating, 197 
complex, 192 
conjugate, 265 
dual, 211, 212, 225 
equivalence of, 220 
faithful, 266 
four-vector, 193, 199, 257 
on function spaces, 199 
fundamental, 193 
of Heisenberg algebra on $L^2(\mathbb{R})$, 206
irreducible, 232
on linear operators, 214
pseudoscalar, 219
pseudovector, 219
real, 192
of $S_n$, 197
scalar, 217
2nd rank antisymmetric tensor, 196, 227, 258
2nd rank antisymmetric tensor and adjoint of O(n), 217
sgn, 197
spin-one, 190, 250
SO(2) on $\mathbb{R}^2$, 241, 250
SO(3) on $\mathcal{H}_l(\mathbb{R}^3)$, $\tilde{\mathcal{H}}_l$ and $L^2(S^2)$, 205
SO(3) on $L^2(\mathbb{R}^3)$, 202
space, 192
spinor, 194, 199, 226, 253, 254, 261, 282
spin s, 190, 200, 210
spin-two, 250
SU(2) on $P_l(\mathbb{C}^2)$, 200
on symmetric and antisymmetric tensors, 215-220
symmetric traceless tensors, 250
on $\mathcal{T}^r_s(V)$, 211
tensor product, 208
tensor product, 208 
trivial, 193 
unitary, 188, 192 
vector, 193
and "spin-one", 249
of Z2, 197
Representation operator, 276
Right-handed spinor. See Representation, spinor
Rigid body motion, 23, 68, 100
Rotation, 95, 111, 125-126, 168, 180
generators, 112, 113, 151, 153, 158, 160
improper, 127
proper, 127


S
Scalar representation. See Representation, scalar
Scalars, 12
Schouten convention, 58
Schrödinger picture, 68
Schur's lemma, 239
Selection rule, 277
angular momentum, 275, 279
parity, 279
Self-adjoint, 49
Semi-simple (group or algebra), 233
Separable state, 85
sgn homomorphism, 143
sgn representation, 197
Sigma matrices. See Pauli matrices
Similarity transformation, 63, 214
Space axes, 23
Space frame, 99
Span (of a set of vectors), 17
Special linear group, 134
Special orthogonal group, 122, 180
Special unitary group, 122, 127
Spectral theorem, 206
Spherical harmonics, 15, 20, 29, 46, 191, 205, 235
Spherical Laplacian, 16, 206
Spherical tensor, 274
Spin, 13, 82, 189
Spin angular momentum, 83
Spinor. See Representation, spinor
Square-integrable, 15, 205
Star operator, 269
Stern-Gerlach experiment, 24
Stone-von Neumann theorem, 189
Structure constants, 171
Subgroup, 118
normal, 137
Subspace, 14
Symmetric
form, 34
matrices, 19
tensors, 85
Symmetric group. See Permutation group
Symmetrization postulate, 90, 145, 197
Symmetry and degeneracy, 241


T
[T], 28
$\mathcal{T}^r_s$. See Tensors, of type (r, s)
Tensor operator, 272, 273
Tensor product
as addition of degrees of freedom, 81-85, 209
of operators, 80
representation, 208
of $\mathfrak{sl}(2,\mathbb{C})_{\mathbb{R}}$ irreps, 257, 295
of su(2) irreps, 234
of vectors, 70
of vector spaces, 70
Tensors
alternating, 88
antisymmetric, 88
basis for vector space of, 72
components of, 3, 6, 7, 53, 72
contraction of, 73
definition of, 52
linear operators, 8, 52, 74
and matrices, 8
rank of, 54
symmetric, 85
as tensor product space, 72
transformation law, 8, 57-69
type (r, s), 52, 72
Time-reversal, 132
and Z2, 142
Trace, 32, 63
cyclic property of, 156
and determinant, 180
interpretation of, 154
Transformation law
of linear operators, 63, 214
of metric tensors, 64
of vectors and dual vectors, 60, 212
Translations, 96, 170
Transpose, 48
Transposition, 85, 143
Trivial representation. See Representation, trivial
U

Unitary 
group, 120 
matrices, 63 
operator, 120 
representation, 192 


V 
Vector operators, 80, 271-272 
Vector representation. See Representation, 
vector 
Vector space 
as additive group, 118 
axioms, 12
complex, 13 
definition of, 12 
isomorphism, 136 
real, 13 


W 

Wedge product, 88, 144 

Weight function, 39 

Weyl spinor, 194 

Wigner D-matrix, 275 

Wigner-Eckart theorem, 275, 279-282 
Wigner function, 275 


